Preface



Abu Ja'far Muhammad ibn Musa al-Khwarizmi
(whose name gives us the word 'algorithm') wrote
an algebra textbook which included much of what is
still regarded as elementary algebra today. The title
of his book was Hisab al-jabr w'al-muqabala. The
word al-jabr means 'restoring', referring to the
process of moving a negative quantity to the other side of
an equation; the word al-muqabala means 'comparing',
and refers to subtracting equal quantities from
both sides of an equation. Both processes are familiar
to anyone who has to solve an equation! The word
al-jabr has, of course, been incorporated into our
language as 'algebra'.

In a similar vein, Doctor Johnson gave this definition of "algebra" in his
Dictionary of 1755:

This is a peculiar kind of arithmetick, which takes the quantity sought, 
whether it be a number or a line, or any other quantity, as if it were 
granted, and by means of one or more quantities given, proceeds by 
consequence, till the quantity at first only supposed to be known, or 
at least some power thereof, is found to be equal to some quantity or 
quantities which are known, and consequently itself is known. 

Since the time of Al-Khwarizmi and Johnson, the subject of algebra has changed
considerably. Firstly, we no longer restrict ourselves to considering just numbers;
the variables and symbols in our equations may be vectors, matrices, polynomials,
sets, or permutations. Secondly, the way we look at these equations has also
changed. As far as possible, we don't care what the variables stand for, but only
the "laws" that they obey (associative, distributive, etc.); so that we can prove
something about a system satisfying certain laws which will apply to systems of
numbers, matrices, polynomials, etc. We sometimes refer to this as "abstract
algebra".





These notes are intended for the course MAS117, Introduction to Algebra, at
Queen Mary, University of London. The course is to be given for the first time in 
the spring semester of 2007. 

The course is intended as a first introduction to the ideas of proof and abstraction
in mathematics, as well as to the concepts of abstract algebra (groups and
rings). The Undergraduate Studies Handbook says: 

This module is an introduction to the basic notion of algebra, such as 
sets, numbers, matrices, polynomials and permutations. It not only 
introduces the topics, but shows how they form examples of abstract 
mathematical structures such as groups, rings, and fields and how algebra
can be developed on an axiomatic foundation. Thus, the notions
of definitions, theorem and proof, example and counterexample are 
described. The course is an introduction to later modules in algebra. 

The course replaces the earlier course Discrete Mathematics, with which it 
shares some material. But since it is a new course, I have re-written the notes
from scratch. Of course, these notes are not a substitute for the lectures! 

The exercises at the ends of the chapters vary in difficulty from routine to 
challenging. To a first approximation, the easier exercises come first. 

If you enjoyed this course, the next step is MAS201, Algebraic Structures I. 
You can also find a set of notes for this course on my web page. 

This set of notes is a slightly revised version of the notes which were available
during the course. I am grateful to Matilda Okungbowa for a number of
corrections.

Note: The pictures and information about mathematicians in these notes are
taken from the St Andrews History of Mathematics website:

http://www-groups.dcs.st-and.ac.uk/~history/index.html



Peter J. Cameron 
25 June 2007 
Chapter 1

What is mathematics about? 



There is a short answer to this question: mathematics is about proofs. In any 
other subject, chemistry, history, sociology, or anything else, what one expert says 
can always be challenged by another expert. In mathematics, once a statement is 
proved, we are sure of it, and we can use it confidently, either to build the next 
part of mathematics on, or in an application of mathematics. 

1.1 Some examples of proofs 

In this part of the course we are going to talk about how to prove things. Let us 
start with an easy theorem. 

Theorem 1.1 Let n be a natural number. Then $n^2$ is odd if and only if n is odd.

If you know what the words in the theorem mean, you might try a few cases, 
to get a feel for what the theorem is about: 
    1 is odd     $1^2 = 1$ is odd
    2 is even    $2^2 = 4$ is even
    3 is odd     $3^2 = 9$ is odd
and so on. It seems to work. But this is not yet a proof; we cannot be sure that,
if you went on far enough, you would not find a number for which the theorem was
not true.

First let us read the theorem more carefully. 

Natural number This means one of the counting numbers, 0, 1, 2, 3, 4, .... (Arguments
still occur among mathematicians about whether 0 should count as a natural
number or not. This is just a matter of names, and doesn't affect the theorem
very much. We will say that 0 is a natural number.)
If and only if We will come back to this later. For now, it means that, for any
value of n, either the two statements "n is odd" and "$n^2$ is odd" are both true, or
they are both false. In other words,

• if n is odd, then $n^2$ is odd;

• if $n^2$ is odd, then n is odd.

This shows us that we have two things to show, in order to prove the theorem. 
The first one looks fairly straightforward, but the second seems more difficult. But 
we can turn it round into something simpler. The statement 

if $n^2$ is odd, then n is odd

is logically the same as the statement

if n is even, then $n^2$ is even.

So we have to prove the two statements:

• if n is odd, then $n^2$ is odd;

• if n is even, then $n^2$ is even.

So let's try to prove them.

We have one more thing to consider. What are even and odd numbers,
mathematically speaking? An even number is one which is divisible by 2 exactly; in
other words, n is even if it can be written as n = 2k for some natural number k.
An odd number is one which leaves a remainder of 1 when divided by 2; in other
words, n is odd if it can be written as n = 2k + 1 for some natural number k.

So to prove the first statement, we assume that n is an odd number, and have
to show that $n^2$ is an odd number. That is, we assume that n = 2k + 1 for some
natural number k. Then

$n^2 = (2k+1)^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1 = 2m + 1$,

where $m = 2k^2 + 2k$. So $n^2$ is odd.

For the second statement, assume that n is even, that is, n = 2k for some natural
number k. Then

$n^2 = (2k)^2 = 4k^2 = 2(2k^2) = 2m$,

where $m = 2k^2$; so $n^2$ is even.

Now we have finished the proof, and we are sure that the theorem is true for 
all natural numbers n. 
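
A finite check is no substitute for the proof, but it can help to see the algebra at work. The following is a minimal sketch in Python; the function name `square_parity_witness` and the bound 1000 are illustrative choices, not part of the notes. For each n it computes the number m used in the proof and confirms that the square has the claimed form.

```python
def square_parity_witness(n):
    """Return (parity, m) where n*n == 2*m + 1 if n is odd, or n*n == 2*m if n is even.

    The value of m is the one constructed in the proof of Theorem 1.1.
    """
    k, r = divmod(n, 2)          # n = 2k + r with r equal to 0 or 1
    if r == 1:
        m = 2 * k * k + 2 * k    # (2k + 1)^2 = 2(2k^2 + 2k) + 1
        assert n * n == 2 * m + 1
        return "odd", m
    m = 2 * k * k                # (2k)^2 = 2(2k^2)
    assert n * n == 2 * m
    return "even", m


# Check a finite range only; this illustrates the proof, it does not replace it.
for n in range(1000):
    parity, _ = square_parity_witness(n)
    assert (parity == "odd") == (n % 2 == 1)
```

However large we make the range, the loop only ever tests finitely many cases; it is the algebra inside the function, valid for every k, that carries the argument.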
Now let's use this theorem as a
building block in a very famous 
theorem, proved by Pythagoras, 
who has some claim to be the first 
mathematician ever (that is, the first 
person to insist that mathematical 
statements must have proofs). It 
was Pythagoras who invented the 
words "mathematics" and 
"theorem". 

Theorem 1.2 The number $\sqrt{2}$ is irrational.

First we have to examine what the theorem means. The number $\sqrt{2}$ is the
positive real number x such that $x^2 = 2$. A rational number is a number that can be
expressed as a fraction a/b, where a and b are integers (that is, natural numbers or
their negatives) and b is not zero.

Now my calculator tells me that $\sqrt{2} = 1.414213562$. If this is right, then
Pythagoras is wrong, because this means that

$\sqrt{2} = \frac{1414213562}{1000000000}$.

But it turns out that the calculator is wrong, because it also tells me that 

$(1.414213562)^2 = 1.999999998944727844$,

which is close to 2 but not exactly 2. Pythagoras claims that, no matter how 
accurately the calculator does the sum and to how many places of decimals it 
expresses the answer, it will never get the exact value of $\sqrt{2}$.
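
The calculator's claim can be tested with exact arithmetic. The sketch below uses Python's standard `fractions` module; the variable names are my own. It squares the fraction 1414213562/1000000000 with no rounding at all.

```python
from fractions import Fraction

# The calculator's value 1.414213562, written as an exact fraction.
approx = Fraction(1414213562, 1000000000)

# Exact rational arithmetic: no rounding takes place here.
square = approx * approx

print(square == 2)    # False: the square is not exactly 2
print(2 - square)     # the (small, positive) amount by which it falls short
```

Of course this only disposes of one particular fraction. Pythagoras's theorem says that the same will happen for every fraction, and that needs a proof.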

So how did Pythagoras prove his theorem? He used another important
technique:

Proof by contradiction If I am trying to prove a statement P, I have succeeded 
if I can show that the assumption that P is false leads to a contradiction, a logical 
absurdity. For this shows that P is not false, that is, it is true. 

So we prove Pythagoras's theorem by contradiction; we assume the falsity of 
what we are trying to prove and head for a contradiction. That is, we assume that 




$\sqrt{2}$ is rational.






That is,

$\sqrt{2} = \frac{m}{n}$

for some natural numbers m and n. Now, in a fraction like this, if there is a
common factor of m and n, we can divide it out, and assume that they have no
common factor. (For example, 4/6 = 2/3.)

Now take our equation. Square roots are awkward; it usually simplifies an 
equation if you can get rid of them. We can easily do this by squaring both sides 
of the equation, to get 

$2 = \frac{m^2}{n^2}$,

or in other words,

$m^2 = 2n^2$.



This equation tells us that $m^2$ is even, since it is 2k where $k = n^2$. Now we are able
to use Theorem 1.1, since we already proved this. Since $m^2$ is even, necessarily m
is even; say m = 2p for some natural number p. Substituting this into the equation
gives

$4p^2 = 2n^2$,

and cancelling a factor 2 gives

$2p^2 = n^2$.



Now we can "do it again". The last equation shows that $n^2$ is even, so that n is
even, say n = 2q. So our original fraction for $\sqrt{2}$ is

$\sqrt{2} = \frac{m}{n} = \frac{2p}{2q}$.

We can cancel the 2 to get a simpler fraction for $\sqrt{2}$.

But stop and remember what we are doing. We started off by saying that we 
can assume that m and n have no common factor, and we ended up with their 
having a common factor of 2. So we have reached a contradiction. 

According to the principle of proof by contradiction, our assumption that $\sqrt{2}$
is rational must be wrong, so that $\sqrt{2}$ is irrational, as Pythagoras claimed.

Now let us have another famous example of a proof by contradiction. We will 
prove that the prime numbers go on for ever; there is no largest prime. (Recently, 
a computer search found a previously unknown prime number bigger than any 
others found so far. A journalist got the idea that they had found "the largest 
prime number", and phoned one of my colleagues for a comment. What would 
you say if this happened to you?) 
This beautiful proof was discovered
by the Greek geometer Euclid, who 
wrote one of the world's most 
successful textbooks ever, which 
was used for nearly two thousand 
years. 




Theorem 1.3 There are infinitely many prime numbers. 

A prime number is a natural number which is divisible only by itself and 1.
So 2, 3, 5 and 7 are prime numbers; 4 is not, since 4 = 2 × 2. By convention, we
say that 1 is not a prime number, even though it satisfies the condition of having
no divisors except itself and 1; this is just a convention, and we will see the reason
for it later. Now if the number n is greater than 1 and is not prime, it must be
divisible by some prime number smaller than n. (Again, we will see why later. This
is not meant to be obvious!)

We prove Euclid's theorem by contradiction. That is, we assume that there are 
only finitely many prime numbers. Then we can make a list of prime numbers: 

$p_1, p_2, p_3, \ldots, p_k$

are all the prime numbers.

Let n be the number that we get when we multiply all of these primes together
and add 1:

$n = p_1 p_2 p_3 \cdots p_k + 1$.

Now there are two cases to consider: either n is prime, or it is not. We need to 
show that either case leads us to a contradiction. 

Case n is prime: In this case, since $p_1, \ldots, p_k$ are all the primes, n must be
one of them. But this is impossible, since n is bigger than any of these primes.
(Remember how we formed n.)

Case n is not prime: Then n must have a prime factor, which must be one
of the primes $p_1, \ldots, p_k$. But n is the product of all the primes plus one; so if we
divide it by any of the primes $p_1, \ldots, p_k$, we get a remainder of one. So this case
is also contradictory.

So, again according to the principle of proof by contradiction, the assumption 
that there are only finitely many primes must be wrong; so there must be infinitely 
many primes. 
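
The key step of the argument can be tried out on a short list of primes. This is only a sketch: the list `[2, 3, 5, 7, 11, 13]` is an arbitrary starting list (certainly not "all the primes"), and the helper names `euclid_number` and `smallest_prime_factor` are my own. It shows that the number n leaves remainder one on division by every prime in the list, so any prime factor of n is a prime missing from the list.

```python
from math import prod


def euclid_number(primes):
    """Return the product of the given primes, plus one."""
    return prod(primes) + 1


def smallest_prime_factor(n):
    """Return the smallest prime factor of n (for n > 1), by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to the square root, so n itself is prime


primes = [2, 3, 5, 7, 11, 13]
n = euclid_number(primes)

# Dividing n by any prime in the list leaves remainder one.
assert all(n % p == 1 for p in primes)

# So a prime factor of n cannot be in the list.
q = smallest_prime_factor(n)
assert q not in primes
```

Notice that n itself need not be prime: for this list n = 30031 = 59 × 509. That is why the proof has to treat two cases.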
1.2 Some proof techniques

Here are some words you'll find in statements you are asked to prove. 

If, implies, sufficient The three statements 

If A, then B

A implies B 

A is a sufficient condition for B 

all have the same meaning. They mean, "if A is true, then B is true". 

Look more closely at this. How could this statement fail to be true? The only 
way it could fail is if A is true and B is false. (If A is false, then the statement is 
correct no matter whether B is true or false.) This seems a bit odd, sometimes, so 
let us take an everyday example. Suppose I say to you, "If it is fine tomorrow, we 
will go for a picnic." The only situation in which my statement is false is if it is 
fine tomorrow and we don't go for a picnic; if it rains tomorrow, my statement is 
technically correct (though maybe not helpful!) 
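
The four possible combinations can be laid out as a truth table. Here is a small Python sketch (the function name `implies` is my own choice) which encodes "if A, then B" exactly as described above: it is false only when A is true and B is false.

```python
from itertools import product


def implies(a, b):
    """Truth value of 'if a, then b': false only when a is true and b is false."""
    return (not a) or b


# Print the full truth table, one row per combination of truth values.
for a, b in product([True, False], repeat=2):
    print(f"A={a!s:5}  B={b!s:5}  (if A, then B)={implies(a, b)}")
```

In the picnic example, the two rows with A false are the rainy days: whatever we do, the promise has not been broken.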

So how do we prove "if A, then B"? The obvious way is to assume that A is
true, and deduce that B must be true. Look back at our proof of "if n is even, then
$n^2$ is even" in the last section. We assume that n is even and prove that $n^2$ is even.

Only if, is implied by, necessary This is exactly the reverse. The three
statements

B only if A 

A is implied by B 

A is a necessary condition for B 

all mean the same as "if B, then A". 

The proof strategy, then, is to assume that B is true, and deduce that A must be 
true. 

If and only if, equivalent, necessary and sufficient We saw earlier that to say 
"A if and only if B" means that either A and B are both true, or they are both false. 
We also saw that there are two things we have to do to show this: "if A, then B" 
and "if B, then A". This agrees with what we just learned about "if" and "only 
if". We sometimes also say "the statements A and B are equivalent", or "A is a 
necessary and sufficient condition for B". 

Now we turn to some proof techniques. 
Proof by contradiction We already met this idea. In order to prove A, we can
assume that A is false and deduce a contradiction (a statement that is logically
impossible). We saw two examples of this: the proofs of Pythagoras' Theorem on
the irrationality of $\sqrt{2}$, and Euclid's theorem that there are infinitely many primes.

Proof by contrapositive This is a fancy way of saying that "A implies B" is
logically equivalent to "not-B implies not-A". We saw an example of this in
Section 1.1. In order to prove the statement "if $n^2$ is odd, then n is odd", we proved
instead its contrapositive, the statement "if n is even, then $n^2$ is even".
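
Since A and B can each only be true or false, the equivalence can be checked by running through all four cases. The sketch below does this in Python (the names are my own). For comparison it also tests "B implies A", which is not equivalent to "A implies B".

```python
from itertools import product


def implies(a, b):
    """Truth value of 'a implies b'."""
    return (not a) or b


rows = list(product([True, False], repeat=2))

# "A implies B" agrees with "not-B implies not-A" in every row of the table ...
contrapositive_agrees = all(
    implies(a, b) == implies(not b, not a) for a, b in rows
)

# ... but it disagrees with "B implies A" in at least one row.
converse_agrees = all(implies(a, b) == implies(b, a) for a, b in rows)

print(contrapositive_agrees, converse_agrees)
```

The row which separates "A implies B" from "B implies A" is the one where A is true and B is false.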

Counterexamples Sometimes you will be given a general proposition, and asked 
whether it is true or false. 

Suppose for example you are trying to prove that some property holds for 
every natural number n. Let us call the property A(n). Now:

• If A(n) is true, then we have to give a general proof for it. 

• If A(n) is false, we only have to give one value of n for which it is not true. 

For example, suppose we are considering the statement "every odd number is
prime". So A(n) would be, "if n is odd, then n is prime". If this happened to be
true, we would have to give a proof of it. But it is false, and all we need to say is
"the number 9 is odd, but is not prime since it is equal to 3 × 3". In this case, we
say that 9 is a counterexample to the statement that, if n is odd, then n is prime.
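
Hunting for a counterexample is something a computer can do for us. The following is a rough sketch; the helper names `is_prime` and `first_counterexample` are hypothetical, and starting the search at 3 is my own choice, made to match the example in the text (with the convention that 1 is not prime, the number 1 would serve just as well).

```python
def is_prime(n):
    """Trial-division primality test; 0 and 1 are not prime by convention."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def odd_implies_prime(n):
    """The statement A(n): 'if n is odd, then n is prime'."""
    return (n % 2 == 0) or is_prime(n)


def first_counterexample(statement, candidates):
    """Return the first candidate for which the statement fails, or None."""
    for n in candidates:
        if not statement(n):
            return n
    return None


print(first_counterexample(odd_implies_prime, range(3, 100)))   # prints 9
```

If the search returns `None`, that tells us nothing beyond the range searched; a true statement still needs a proof.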

1.3 Proof by induction 

This is a more specialised technique but is very important, so we give it a section 
to itself. 

Suppose that we are trying to prove a statement about all natural numbers. 
Suppose that A(n) is the statement about the particular natural number n. The
strategy of proof by induction is to do the following:

(a) Prove the statement A(0), that is, the case when n = 0.

(b) Prove that, if A(n) is true, then A(n + 1) is true. In other words, assume
A(n) and prove A(n + 1).

Here (a) is called "starting the induction", and (b) is "the inductive step". 

This is a bit confusing at first, since in part (b) we seem to be assuming the 
thing we are trying to prove, namely A(n); an argument where you assume what 
you are trying to prove can't be valid, right? Well, in this case the argument is 
right. By (a), we know that A(0) is true. Now by (b) (in the case n = 0), we know
that A(0) implies A(1), so A(1) must be true. By (b) again (with n = 1), we know
that A(1) implies A(2), so A(2) must be true. And so on. Given any number n, we
can count up to n; and at each step of the way, (b) allows us to get from the truth
of each statement to the truth of the next.

Suppose that we have a line of dominos, as shown in the diagram. 



If we push over the first domino, what will happen? It will knock over the 
second, which will knock over the third, and so on; eventually all the dominos 
will fall. This is like induction. The inductive step is the fact that each domino 
knocks over the next one, and starting the induction is giving the first domino a 
push. 

We have a bit of freedom about starting the induction. Instead of 0, it might
be more convenient to start by proving A(1); this and the inductive step show that
A(n) is true for all n ≥ 1. We'll see an example soon where we start with A(2).

Here is an example. What is the sum of the first n positive integers? Induction 
doesn't help us guess the answer, but if we can guess it, induction will let us prove 
that our guess is correct. 

Theorem 1.4 The sum of the first n positive integers is n(n + 1)/2.

Again we can check this for small values: for example,

1 + 2 + 3 + 4 + 5 = 15 = 5 × 6/2.

Here is the proof by induction. Let A(n) be the statement

$1 + 2 + \cdots + n = \frac{n(n+1)}{2}$.

Starting the induction For n = 1, the left-hand side is 1, and the right-hand
side is 1 × 2/2 = 1; so A(1) is true.
The inductive step Suppose that A(n) is true; that is,

$1 + 2 + \cdots + n = \frac{n(n+1)}{2}$.

We have to prove that A(n + 1) is true.

Now the left-hand side of A(n + 1) is $1 + 2 + \cdots + n + (n+1)$. Since we are
assuming that A(n) is true, this is equal to

$\frac{n(n+1)}{2} + (n+1) = \frac{n(n+1)}{2} + \frac{2(n+1)}{2} = \frac{(n+1)(n+2)}{2}$

after a little bit of algebraic manipulation. But this is exactly the right-hand side
of A(n + 1); it is what we get from the expression n(n + 1)/2 if we substitute n + 1
in place of n. So the left and right sides of A(n + 1) are equal, and A(n + 1) is true.
By induction, we have proved that A(n) is true for all n ≥ 1.
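
As a sanity check on the formula and on the algebra of the inductive step, here is a short Python sketch. The function name `formula` and the bound 1000 are my own choices. It compares the running sum with n(n + 1)/2, and also confirms for each n that adding n + 1 to the formula for n gives the formula for n + 1.

```python
def formula(n):
    """Right-hand side of A(n): n(n + 1)/2, using exact integer division."""
    return n * (n + 1) // 2


running_sum = 0
for n in range(1, 1001):
    running_sum += n                    # now equal to 1 + 2 + ... + n
    assert running_sum == formula(n)    # A(n) holds for this n

    # The algebra of the inductive step, checked numerically:
    assert formula(n) + (n + 1) == formula(n + 1)
```

The loop checks a thousand cases; the induction proves them all at once.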

Unfinished business I told you earlier that if a natural number n is greater than 1
and is not prime, then it is divisible by some prime number less than n. In other
words,

Theorem 1.5 Every natural number n > 1 has a prime factor.

We prove this theorem by induction. Take A(n) to be the statement "every
natural number k satisfying 1 < k ≤ n has a prime factor". We prove A(n) by
induction.

Starting the induction We can conveniently start the induction with n = 2:
there is only one number k satisfying 1 < k ≤ 2, namely k = 2, and it has a prime
factor, namely 2. [Note: We could start the induction with n = 1: there are no
numbers k satisfying 1 < k ≤ 1, and so any statement at all is true for all of them!
But you may feel uncomfortable with this sort of argument!]

The inductive step We assume that A(n) is true, and we have to prove A(n + 1).
In other words, we assume that every natural number k satisfying 1 < k ≤ n
has a prime factor, and we have to prove that every natural number k satisfying
1 < k ≤ n + 1 has a prime factor. Well, we don't have to prove it for all these
numbers, since the hypothesis A(n) shows that it is true for k = 2, 3, ..., n; we
only have to prove it for k = n + 1.

Case 1: n + 1 is prime. If it is prime, it certainly has a prime factor, namely itself.

Case 2: n + 1 is not prime; so n + 1 = ab for some natural numbers a and b,
where neither factor is 1. Then each factor must be smaller than n + 1. So,
for example, 1 < a ≤ n. By A(n), we know that a has a prime factor p. Then
p is also a factor of n + 1, and we have finished.

This completes the proof by induction. 
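
The proof is really a recipe for finding a prime factor, and the recipe can be written as a recursive function. The sketch below is one way to do it in Python; the name `prime_factor` is hypothetical, and the search for a factor a by trial division is my own choice of method. The two branches correspond to the two cases of the inductive step.

```python
def prime_factor(n):
    """Return some prime factor of n, for n > 1, following the proof of Theorem 1.5."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    for a in range(2, n):
        if n % a == 0:
            # Case 2: n = a * b with 1 < a < n. The smaller number a has a
            # prime factor (the inductive hypothesis), and it also divides n.
            return prime_factor(a)
    # Case 1: n has no factor a with 1 < a < n, so n is prime.
    return n


print(prime_factor(91))   # prints 7, since 91 = 7 * 13
```

The recursive call is always on a smaller number, which is exactly why the induction works.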

Here is a variant on the principle of induction. Sometimes you might find this 
easier to apply. 

Suppose that we are trying to prove a statement A(n). We begin by arguing by
contradiction: we assume that A(n) isn't true for all values of n, that is, there is
some value of n for which it is false. So there must be a smallest value of n for
which A(n) is false. Now this n has the property that A(n) is false but A(m) is true
for all numbers m smaller than n; so we call n the "minimal counterexample" to
the statement we are trying to prove. (Some people call n the "least criminal".) If
we can show that no minimal counterexample can exist, then we have proved that
A(n) is true for all n.

Why is this the same as induction? Well, let n be the minimal counterexample,
and remember we are trying to get a contradiction. Maybe n = 0. To show a
contradiction, we have to show that A(0) is true. Or maybe n > 0. Now A(n)
is false and A(n − 1) is true, so if we could show that A(n − 1) implies A(n), we
would have a contradiction in this case too. So the two things we have to prove are
precisely the same as starting the induction and doing the inductive step in a proof
by induction. But sometimes it is easier to think about a minimal counterexample.

Take an induction proof and try writing it out in the "minimal counterexample" 
style, and see which you prefer. 

1.4 Some more mathematical terms 

There are many other specialised terms in mathematics. 

Theorem, Proposition, Lemma, Corollary These words all mean the same
thing: a statement which we can prove. We use them for slightly different
purposes.

A theorem is an important statement which we can prove. A proposition is 
a statement which is less important. (Of the five theorems we've seen so far, I 
would normally call two of them "theorems" and the other three "propositions"; 
can you guess which are which?) A corollary is a statement which follows easily 
from a theorem or proposition. For example, the statement 

Let n be a natural number. Then $n^4$ is odd if and only if n is odd.

follows easily from Theorem 1.1, so I could call it a corollary of Theorem 1.1.
Finally, a lemma is a statement which is proved as a stepping stone to some
more important theorem. So I could have called Theorem 1.1 a lemma for the proof
of Theorem 1.2. (Remember how we used Theorem 1.1 in the proof of Theorem 1.2.)

Of course these words are not used very precisely; it is a matter of judgment 
whether something is a theorem, proposition, or whatever. For example, there is a 
very famous theorem called Fermat's Last Theorem, which is the following: 

Theorem 1.6 Let n be a natural number bigger than 2. Then there are no positive
integers x, y, z satisfying $x^n + y^n = z^n$.

This was proved fairly recently by Andrew Wiles, so why do we attribute it to 
Fermat? 

Pierre de Fermat wrote the 
statement of this theorem in the 
margin of one of his books. He 
said, "I have a truly wonderful 
proof of this theorem, but this 
margin is too small to contain it." 
No such proof was ever found, and 
today we don't believe he had a 
proof; but the name stuck. 



Conjecture The proof of Fermat's Last Theorem is rather complicated, and I 
will not give it here! Note that, for about 350 years (between Fermat and Wiles), 
"Fermat's Last Theorem" wasn't a theorem, since we didn't have a proof! A 
statement that we think is true but we can't prove is called a conjecture. So we 
should really have called it Fermat's Conjecture. 

An example of a conjecture which hasn't yet been proved is Goldbach's
conjecture:

Every even number greater than 2 is the sum of two prime numbers. 

To prove this is probably very difficult. But to disprove it, a single counterex- 
ample (an even number which is not the sum of two primes) would do. 
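
We can at least look for a counterexample among small numbers. Here is a sketch in Python; the helper names and the bound 2000 are arbitrary choices of mine. For each even number it searches for a way of writing it as a sum of two primes.

```python
def is_prime(n):
    """Trial-division primality test; 0 and 1 are not prime by convention."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def goldbach_pair(n):
    """Return primes (p, q) with p <= q and p + q == n, or None if there are none."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return p, n - p
    return None


# Even numbers greater than 2, up to an arbitrary bound, with no such pair.
counterexamples = [n for n in range(4, 2001, 2) if goldbach_pair(n) is None]
print(counterexamples)   # prints [] : no counterexample in this range
```

An empty list is evidence, not proof: it says nothing about the even numbers beyond the bound.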

Prove, show, demonstrate These words all mean the same thing. We have 
discussed how to give a mathematical proof of a statement. These words all ask 
you to do that. 
Converse The converse of the statement "A implies B" (or "if A then B") is the
statement "B implies A". They are not logically equivalent, as we saw when we
discussed "if" and "only if". You should regard the following conversation as a
warning! Alice is at the Mad Hatter's Tea Party and the Hatter has just asked her
a riddle: 'Why is a raven like a writing-desk?'

'Come, we shall have some fun now!' thought Alice. 'I'm glad they've 
begun asking riddles.-I believe I can guess that,' she added aloud. 

'Do you mean that you think you can find out the answer to it?' said the 
March Hare. 

'Exactly so,' said Alice. 

'Then you should say what you mean,' the March Hare went on. 

'I do,' Alice hastily replied; 'at least-at least I mean what I say-that's 
the same thing, you know.' 

'Not the same thing a bit!' said the Hatter. 'You might just as well
say that "I see what I eat" is the same thing as "I eat what I see"!' 'You
might just as well say,' added the March Hare, 'that "I like what I get" is the
same thing as "I get what I like"!' 'You might just as well say,' added the
Dormouse, who seemed to be talking in his sleep, 'that "I breathe when I
sleep" is the same thing as "I sleep when I breathe"!'

'It is the same thing with you,' said the Hatter, and here the conversation 
dropped, and the party sat silent for a minute, while Alice thought over all 
she could remember about ravens and writing-desks, which wasn't much. 

Definition To take another example from Lewis Carroll, recall Humpty Dumpty's 
statement: "When I use a word, it means exactly what I want it to mean, neither 
more nor less". 

In mathematics, we use a lot of words with very precise meanings, often quite
different from their usual meanings. When we introduce a word which is to have
a special meaning, we have to say precisely what that meaning is to be. Usually,
the word being defined is written in italics. For example, in Geometry I, you met
the definition

An m × n matrix is an array of numbers set out in m rows and n
columns.

From that point, whenever the lecturer uses the word "matrix", it has this meaning,
and has no relation to the meanings of the word in geology, in medicine, and in
science fiction.

If you are trying to solve a coursework question containing a word whose 
meaning you are not sure of, check your notes to see if you can find a definition 
of that word. 
Exercises

1.1 Write down and prove the contrapositive of the statement 

If x is an irrational number then 1 − x is an irrational number.

1.2 Find counterexamples to the statements 

(a) Every odd number is prime. 

(b) Every prime number is odd. 

1.3 Prove by induction that

$1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}$.

1.4 Let n be a positive natural number, and suppose that n has the property that
every positive natural number smaller than n/2 divides n. Prove that n ≤ 6, and
hence find all numbers n with this property.

1.5 Define the binomial coefficient $\binom{n}{k}$ for natural numbers n and k by the rule

$\binom{n}{k} = \frac{n \cdot (n-1) \cdots (n-k+1)}{k \cdot (k-1) \cdots 1} = \frac{n!}{k!\,(n-k)!}$ if 0 ≤ k ≤ n,

$\binom{n}{k} = 0$ if k > n.

(Here n! is the product of the natural numbers from 1 to n.)

(a) I have given you two definitions here. Prove that they are equivalent. 

(b) Prove that

$\binom{n+1}{k} = \binom{n}{k-1} + \binom{n}{k}$.

(c) Using this and induction on n, prove the Binomial Theorem:

$(x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$

for positive integers n.

1.6 Prove that

$\sum_{i=0}^{n} \binom{i}{k} = \binom{n+1}{k+1}$.






1.7 Find the mistake in the following proof of the "Theorem": All triangles are 
isosceles. (You will need to draw a figure!) 

Proof Given any triangle ABC, let D be the point inside the triangle where the 
bisector of the angle A meets the perpendicular bisector of the side BC. Now let 
DM be the perpendicular from D to AB and DN be the perpendicular from D to 
AC. 

Step 1 The triangles ADN and ADM are congruent (since they have the same 
angles and they also have the side AD in common). 

Step 2 The triangles CDN and BDM are congruent (since DN = DM from 
Step 1, and DC = DB as DL is the perpendicular bisector of BC by construction, 
and the angles CND and BMD are both right angles). 

Step 3 From Step 1 we have AN = AM, and from Step 2 we have NC = MB. 
Hence AC = AB. 



Chapter 2 
Numbers 



Algebra begins by considering numbers and their properties, and moves on to 
other kinds of mathematical objects. In this section of the notes, we will look at 
numbers. 

The important sets of numbers are: 

• the natural numbers, denoted by ℕ;

• the integers, denoted by ℤ;

• the rational numbers, denoted by ℚ;

• the real numbers, denoted by ℝ;

• the complex numbers, denoted by ℂ.

The notation we use for them is a special typeface called "blackboard bold".
Originally, number systems were printed in bold type: N, Z, etc.; lecturers writing on
the blackboard couldn't write in bold, so invented a different way of doing it; then
the printers had to catch up by designing a typeface.

The notation ℕ, ℝ and ℂ for natural, real and complex numbers is easy to
remember; but what about the others? If the real numbers are called ℝ, then we
need a different letter for the rational numbers; we choose ℚ for "quotients", since
every rational number has the form a/b where a and b are integers. The ℤ comes
from the German word Zahlen, meaning numbers.

In this section, you will not learn definitions of numbers. I will assume that 
you know what numbers are; we will revise some of their properties. 
2.1 The natural numbers



The German mathematician Leopold 
Kronecker (pictured) said, "God made the 
natural numbers; all the rest is the work of 
man." In the same spirit, the French 
mathematician Émile Borel said, "All of
mathematics can be deduced from the sole 
notion of an integer; here we have a fact 
universally acknowledged today." 



The important properties of the natural numbers are: 

(a) They are used in counting. We can start from zero and, in principle, count up
a step at a time to reach any natural number. (Of course there are practical
limits!) This is the basis of proof by induction, as we saw in the last chapter.

(b) We can add and multiply natural numbers. These operations satisfy a number
of familiar laws that you probably never stopped to think about. These
include:

a + b = b + a,    ab = ba,
(a + b) + c = a + (b + c),    (ab)c = a(bc),
a(b + c) = ab + ac,
0 + a = a,    1a = a.

These laws are important to us, and they have been given names, which you
will need to know. The first two are the commutative laws (for addition
and multiplication respectively), the next two are the associative laws (for
addition and multiplication), the fifth is the distributive law, and the last two
are the identity laws (for addition and multiplication).

(c) Although we can add and multiply, we cannot always subtract or divide
natural numbers. There is no natural number x such that 4 + x = 2, and no y
such that 3y = 5.
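
These properties are easy to spot-check on a computer. The sketch below, in Python, tests the seven laws of (b) on a small sample of natural numbers and then searches for solutions of the two equations in (c). The sample `range(0, 8)` and the search bound 100 are arbitrary choices of mine, so this is an illustration and not a proof.

```python
from itertools import product

sample = range(0, 8)   # a small, arbitrary sample of natural numbers

for a, b, c in product(sample, repeat=3):
    # Commutative laws
    assert a + b == b + a and a * b == b * a
    # Associative laws
    assert (a + b) + c == a + (b + c) and (a * b) * c == a * (b * c)
    # Distributive law
    assert a * (b + c) == a * b + a * c
    # Identity laws
    assert 0 + a == a and 1 * a == a

# Item (c): a bounded search finds no natural number x with 4 + x = 2,
# and no natural number y with 3y = 5.
solutions_sub = [x for x in range(100) if 4 + x == 2]
solutions_div = [y for y in range(100) if 3 * y == 5]
print(solutions_sub, solutions_div)   # prints [] []
```

For the two equations the bounded search is in fact enough: 4 + x and 3y only get bigger as x and y grow, so no larger value can work either.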

The facts that subtraction and division are not possible in the natural numbers 
can be viewed another way. Since we can think of subtraction as "adding the 
negative" and division as "multiplying by the reciprocal", we can formulate two 
further laws known as the inverse laws to describe the situation. These are laws 
which do not hold for the natural numbers! 



Additive inverse law: For any element a, there exists an element -a such that
a + (-a) = 0.

Multiplicative inverse law: For any element a ≠ 0, there exists an element
a^(-1) such that a · a^(-1) = 1.

Notice the exclusion in the multiplicative inverse law; we can't divide by zero! 

The laws for the natural numbers can be interpreted in terms of counting. This 
depends on two obvious principles: 

• a row of a dots, followed by a row of b dots, contains a + b dots. 

• a rectangle of dots with sides a and b contains ab dots. 
The figure illustrates this for a = 2 and b = 3.

                 • • •
• •  • • •       • • •

2 + 3 = 5        3 × 2 = 6



Now the laws of algebra can be explained by geometric transformations. For 
example, the picture below shows the commutative law for addition and the dis- 
tributive law. In the first case, we have reflected the figure left-to-right. 



[Figure: dot diagrams illustrating 3 + 2 = 2 + 3 and (3 + 4) × 2 = 3 × 2 + 4 × 2]



You are invited to produce similar geometric explanations of the commutative law 
for multiplication and the associative laws. 



2.2 The integers 

We enlarge the number system because we are trying to solve equations which 
can't be solved in the original system. At every stage in the process, people first
thought that the new numbers were just aids to calculating, and not "proper" numbers.
The names given to them reflect this: negative numbers, improper fractions,
irrational numbers, imaginary numbers! Only later were they fully accepted. You 
may like to read the book Imagining Numbers by Barry Mazur, about the long 
process of accepting imaginary numbers. 

Anyway, we can't always subtract natural numbers, so we add negative num- 
bers to make it possible. The integers are the natural numbers together with their 
negatives. So addition, subtraction, and multiplication are all possible for inte- 
gers. The laws we met for natural numbers all continue to hold for integers. Also, 
the additive inverse law (but not the multiplicative inverse law) holds for integers. 

The natural numbers 1, 2, ... are positive, while -1, -2, ... are negative. Integers
satisfy the law of signs: the product of a positive and a negative number is
negative, while the product of two negative numbers is positive.



2.3 The rational numbers 



In a similar way, rational numbers are introduced because we cannot always divide 
integers. A rational number is a number which can be written as a fraction a/b,
where a and b are integers and b ≠ 0. We require that multiplying or dividing the
numerator and denominator (top and bottom) of a fraction by the same non-zero
integer doesn't change the fraction. So, if the denominator is negative, we can
multiply by -1 to make it positive; and if numerator and denominator have a common
factor, we can divide by it. (We say that a fraction a/b is in its lowest terms if
the highest common factor of a and b is 1.)

We can write rules for adding and multiplying rational numbers: 



a/b + c/d = (ad + bc)/(bd),        a/b - c/d = (ad - bc)/(bd),

(a/b) · (c/d) = (ac)/(bd),        (a/b) ÷ (c/d) = (ad)/(bc)  if c ≠ 0.



The last rule says: to divide by a fraction, turn it upside down and multiply. 

So, for rational numbers, addition, subtraction, multiplication, and division 
(except by 0) are all possible. The rules we met for natural numbers all hold for 
rational numbers, and so do the two inverse laws. 
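The rules for fractions can be tried out on a computer. The following Python sketch writes the addition and division rules by hand and compares them with the exact rational arithmetic of the standard `fractions` module. The helper names `add` and `divide` are assumptions made for this example; the denominators b and d are assumed to be non-zero.

```python
from fractions import Fraction

def add(a, b, c, d):
    """a/b + c/d computed by the rule (ad + bc)/(bd)."""
    return Fraction(a * d + b * c, b * d)

def divide(a, b, c, d):
    """(a/b) divided by (c/d), computed by the rule (ad)/(bc); needs c != 0."""
    if c == 0:
        raise ZeroDivisionError("cannot divide by a fraction equal to zero")
    return Fraction(a * d, b * c)

# Compare the hand-written rules with Python's built-in rational arithmetic.
print(add(1, 2, 1, 3) == Fraction(1, 2) + Fraction(1, 3))      # True
print(divide(1, 2, 3, 4) == Fraction(1, 2) / Fraction(3, 4))   # True

# Fraction stores results in lowest terms with a positive denominator.
print(Fraction(6, -8))   # -3/4
```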



2.4 The real numbers

There are still many equations we can't solve with rational numbers. One such
equation is x^2 = 2 (we saw Pythagoras' proof of this in the last chapter). Other
equations involve functions from trigonometry (such as sin x = 1, which has the
irrational solution x = π/2) and calculus (such as log x = 1, which has the irrational
solution x = e).

So, we take a larger number system in which these equations can be solved, 
the real numbers. A real number is a number that can be represented as an infinite 
decimal. This includes all the rational numbers and many more, including the 
solutions of the three equations above; for example, 

2/5 = 0.4,

1/7 = 0.142857142857...,

√2 = 1.41421356237...,

π/2 = 1.57079632679...,

e = 2.71828182846...

In the last three cases, we cannot write out the number exactly as a decimal, but 
we assume that the approximation gets better as the number of digits increases. 

We can add, subtract, multiply, and divide (except by zero) in the system of 
real numbers, and the laws we met earlier (including the inverse laws) all hold 
here too. 
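For a rational number, the digits of the decimal expansion can be produced one at a time by long division. Here is a minimal Python sketch of that process; the function name `decimal_digits` is my own, and the code assumes 0 ≤ numerator < denominator.

```python
def decimal_digits(numerator, denominator, count):
    """Return the first `count` digits after the decimal point of
    numerator/denominator, assuming 0 <= numerator < denominator."""
    digits = []
    remainder = numerator
    for _ in range(count):
        remainder *= 10                          # bring down the next zero
        digits.append(remainder // denominator)  # next digit of the quotient
        remainder %= denominator                 # carry the remainder forward
    return "".join(str(d) for d in digits)

print("0." + decimal_digits(1, 7, 12))   # 0.142857142857
print("0." + decimal_digits(2, 5, 12))   # 0.400000000000
```

Since only finitely many remainders are possible, the remainder must eventually repeat, which is why the expansion of a rational number either stops or recurs.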

2.5 The complex numbers 

The final extension arises because there are still equations we can't solve, such as
x^2 = -1 (which has no real solution) or x^3 = 2 (which has only one, though for
various reasons we would like it to have three). It turns out that the first equation
is the crucial one.

A complex number is a number of the form a + bi, where a and b are real
numbers, and i is a mysterious symbol which will have the property that i^2 = -1.
The rules for addition and multiplication are

(a + bi) + (c + di) = (a + c) + (b + d)i,
(a + bi)(c + di) = (ac - bd) + (ad + bc)i.

You can work out the rule for subtraction. How do we divide? You can check that
the rule above gives

(a + bi)(a - bi) = a^2 + b^2,

which is a positive number unless a = b = 0. So, to divide by a + bi, we multiply
by

(a/(a^2 + b^2)) - (b/(a^2 + b^2))i.

Thus, in the complex numbers, we can add, subtract, multiply, and divide 
(except by zero), and the laws we met earlier (including the inverse laws) all apply 
here too. 
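The division rule can be sketched in a few lines of Python. The function below treats a complex number as a pair of real numbers and divides c + di by a + bi by multiplying by the inverse of a + bi; the name `divide` and the tuple representation are assumptions made for this illustration, and the result is compared with Python's built-in complex type.

```python
def divide(a, b, c, d):
    """Return (c + di) / (a + bi) as a pair (real part, imaginary part).

    We multiply c + di by (a - bi) / (a^2 + b^2), the inverse of a + bi.
    """
    norm = a * a + b * b
    if norm == 0:
        raise ZeroDivisionError("cannot divide by the complex number zero")
    p, q = a / norm, -b / norm        # inverse of a + bi is p + qi
    # (c + di)(p + qi) = (cp - dq) + (cq + dp)i
    return (c * p - d * q, c * q + d * p)

real, imag = divide(1, 1, 1, 5)       # (1 + 5i) / (1 + i)
print(real, imag)                     # 3.0 2.0
print(complex(1, 5) / complex(1, 1))  # (3+2j)
```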

Complex numbers are not called complex because they are complicated: a 
modern advertising executive would certainly have come up with a different name! 
They are called "complex" because each complex number is built of two parts, 
each of which is simpler (being a real number). 

Here, unlike for the other forms of numbers, we don't have to take on trust 
that the laws hold; we can prove them. Here, for example, is the distributive law. 
Let z_1 = a_1 + b_1 i, z_2 = a_2 + b_2 i, and z_3 = a_3 + b_3 i. Now

z_1(z_2 + z_3) = (a_1 + b_1 i)((a_2 + a_3) + (b_2 + b_3)i)
               = (a_1(a_2 + a_3) - b_1(b_2 + b_3)) + (a_1(b_2 + b_3) + b_1(a_2 + a_3))i,

and

z_1 z_2 + z_1 z_3 = ((a_1 a_2 - b_1 b_2) + (a_1 b_2 + a_2 b_1)i) + ((a_1 a_3 - b_1 b_3) + (a_1 b_3 + a_3 b_1)i)
                  = (a_1 a_2 - b_1 b_2 + a_1 a_3 - b_1 b_3) + (a_1 b_2 + a_2 b_1 + a_1 b_3 + a_3 b_1)i,

and a little bit of rearranging shows that the two expressions are the same. 

If z = a + bi is a complex number (where a and b are real), we say that a and b
are the real part and imaginary part of z respectively. The complex number a - bi
is called the complex conjugate of z, and is written as z̄. So the rules for addition
and subtraction can be put like this: 

To add or subtract complex numbers, we add or subtract their real 
parts and their imaginary parts. 

The rule for multiplication looks more complicated as we have written it out. 
There is another representation of complex numbers which makes it look simpler. 
Let z = a + bi. We define the modulus and argument of z by 

|z| = √(a^2 + b^2),
arg(z) = θ, where cos θ = a/|z| and sin θ = b/|z|.

In other words, if |z| = r and arg(z) = θ, then

z = r(cos θ + i sin θ).

Now the rules for multiplication and division are:

To multiply two complex numbers, multiply their moduli and add
their arguments. To divide two complex numbers, divide their moduli 
and subtract their arguments. 
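Here is a short numerical illustration of the multiplication rule, using Python's standard `cmath` module (`cmath.polar` returns the modulus and argument, and `cmath.rect` converts back). The particular numbers 3 + 2i and 1 + i are just sample values; because of floating-point rounding, the two answers agree only approximately.

```python
import cmath
import math

z1 = complex(3, 2)
z2 = complex(1, 1)

r1, theta1 = cmath.polar(z1)   # modulus and argument of z1
r2, theta2 = cmath.polar(z2)   # modulus and argument of z2

# Multiply the moduli and add the arguments, then convert back.
product_polar = cmath.rect(r1 * r2, theta1 + theta2)

print(product_polar)                          # approximately 1 + 5i
print(z1 * z2)                                # (1+5j)
print(math.isclose(abs(z1 * z2), r1 * r2))    # True
```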

2.6 The complex plane, or Argand diagram 

The complex numbers can be represented geometrically, by points in the Euclidean
plane (which is usually referred to as the Argand diagram or the complex
plane for this purpose). The complex number z = a + bi is represented as the point
with coordinates (a, b). Then |z| is the length of the line from the origin to the
point z, and arg(z) is the angle between this line and the x-axis. See Figure 2.1.




Figure 2.1: The Argand diagram (the point z = a + bi, with a = r cos θ and b = r sin θ)

In terms of the complex plane, we can give a geometric description of addition 
and multiplication of complex numbers. The addition rule is the same as you 
learned for adding vectors in Geometry I, namely, the parallelogram rule (see 
Figure 2.2). 

Multiplication is a little bit more complicated. Let z be a complex number
with modulus r and argument θ, so that z = r(cos θ + i sin θ). Then the way
to multiply an arbitrary complex number by z is a combination of a stretch and a
rotation: first we expand the plane so that the distance of each point from the origin
is multiplied by r; then we rotate the plane through an angle θ. See Figure 2.3,
where we are multiplying by 1 + i = √2(cos(π/4) + i sin(π/4)); the dots represent
the stretching out by a factor of √2, and the circular arc represents the rotation by
π/4.

Figure 2.2: Addition of complex numbers

Figure 2.3: Multiplication of complex numbers, showing (3 + 2i)(1 + i) = 1 + 5i

Now let's check the correctness of our rule for multiplying complex numbers.
Remember that the rule is: to multiply two complex numbers, we multiply the
moduli and add the arguments. To see that this is correct, suppose that z_1 and
z_2 are two complex numbers; let their moduli be r_1 and r_2, and their arguments
θ_1 and θ_2. Then

z_1 = r_1(cos θ_1 + i sin θ_1),
z_2 = r_2(cos θ_2 + i sin θ_2).

Then

z_1 z_2 = r_1 r_2 (cos θ_1 + i sin θ_1)(cos θ_2 + i sin θ_2)
        = r_1 r_2 ((cos θ_1 cos θ_2 - sin θ_1 sin θ_2) + (cos θ_1 sin θ_2 + sin θ_1 cos θ_2)i)
        = r_1 r_2 (cos(θ_1 + θ_2) + i sin(θ_1 + θ_2)),

which is what we wanted to show.

From this we can prove De Moivre's Theorem:

Theorem 2.1 For any natural number n, we have

(cos θ + i sin θ)^n = cos nθ + i sin nθ.

Proof The proof is by induction. Starting the induction is easy since
(cos θ + i sin θ)^0 = 1 and cos 0 + i sin 0 = 1.

For the inductive step, suppose that the result is true for n, that is,

(cos θ + i sin θ)^n = cos nθ + i sin nθ.

Then

(cos θ + i sin θ)^(n+1) = (cos θ + i sin θ)^n · (cos θ + i sin θ)
                        = (cos nθ + i sin nθ)(cos θ + i sin θ)
                        = cos(n + 1)θ + i sin(n + 1)θ,

which is the result for n + 1. So the proof by induction is complete.

Note that, in the second line of the chain of equations, we have used the in- 
ductive hypothesis, and in the third line, we have used the rule for multiplying 
complex numbers. 

The argument is clear if we express it geometrically. To multiply by the complex
number (cos θ + i sin θ)^n, we rotate n times through an angle θ, which is the
same as rotating through an angle nθ.

De Moivre's Theorem is useful in deriving trigonometrical formulae. For ex- 
ample, 

cos 3θ + i sin 3θ = (cos θ + i sin θ)^3
                  = (cos^3 θ - 3 cos θ sin^2 θ) + (3 cos^2 θ sin θ - sin^3 θ)i,

so

cos 3θ = cos^3 θ - 3 cos θ sin^2 θ,
sin 3θ = 3 cos^2 θ sin θ - sin^3 θ.

These can be converted into the more familiar forms cos 3θ = 4 cos^3 θ - 3 cos θ
and sin 3θ = 3 sin θ - 4 sin^3 θ by using the equation cos^2 θ + sin^2 θ = 1.
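A numerical check is no substitute for the proof, but it is a good way to catch a slip in a formula. The Python sketch below compares the two sides of De Moivre's Theorem for a sample angle, and then tests the two triple-angle formulae; the function name `de_moivre_gap` and the sample angle 0.7 are arbitrary choices made for this example.

```python
import math

def de_moivre_gap(theta, n):
    """Absolute difference between the two sides of De Moivre's Theorem."""
    lhs = complex(math.cos(theta), math.sin(theta)) ** n
    rhs = complex(math.cos(n * theta), math.sin(n * theta))
    return abs(lhs - rhs)

theta = 0.7
print(de_moivre_gap(theta, 5) < 1e-12)   # True, up to rounding error

# The triple-angle formulae.
c, s = math.cos(theta), math.sin(theta)
print(math.isclose(math.cos(3 * theta), 4 * c**3 - 3 * c))   # True
print(math.isclose(math.sin(3 * theta), 3 * s - 4 * s**3))   # True
```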



Exercises

2.1 Prove by induction or otherwise that

1/(1·2) + 1/(2·3) + ... + 1/((n-1)·n) = (n-1)/n.

2.2 Use De Moivre's Theorem to express cos4x as a polynomial in cosx, and to 
express sin4x as a polynomial in sinx. 

2.3 Find

√(2 + √(2 + √(2 + ...))).

2.4 The quaternions form a number system discovered by Hamilton. They have
the form a + bi + cj + dk, where a, b, c, d ∈ R and i, j, k are new symbols which
satisfy

i^2 = j^2 = k^2 = ijk = -1.

(a) Write down rules for the sum and product of two quaternions.

(b) Show that the associative law for multiplication holds for quaternions.

(c) Show that (a + bi + cj + dk)(a - bi - cj - dk) = a^2 + b^2 + c^2 + d^2, and
hence show that the quaternions satisfy the inverse law for multiplication
(that is, every non-zero quaternion has a multiplicative inverse).



Chapter 3 

Other algebraic systems 



In this section, we will look at other algebraic systems which have operations 
which resemble addition and multiplication for number systems. These operations 
satisfy some of the laws which hold for numbers, but not necessarily all of them. 
A reminder: we are interested in the following laws: 

Commutative laws: a + b = b + a, ab = ba
Associative laws: (a + b) + c = a + (b + c), (ab)c = a(bc)
Distributive law: a(b + c) = ab + ac
Identity laws: 0 + a = a, 1a = a

Inverse laws: For all a there exists -a such that a + (-a) = 0; for all a ≠ 0,
there exists a^(-1) such that a · a^(-1) = 1.

We have to be a bit careful about what the identity laws mean, since in other algebraic
systems there will not be numbers 0 and 1 to use here. The identity law for
multiplication should mean that there is a particular element e (say) in our system
such that ea = a for every element a. In the case of number systems, the number
1 has this property. Similarly we have to be careful about the interpretation of -a
and a^(-1) in the inverse laws. But notice that we don't even have to try to check the
additive or multiplicative inverse laws unless the additive or multiplicative identity
laws hold.

3.1 Vectors 

In Geometry I, you learned how to add 3-dimensional vectors, and two different
ways to multiply them: the scalar product or dot product, and the vector product
or cross product. Given two vectors u, v, we denote their sum by u + v, their dot
product by u · v, and their cross product by u × v.

(Vectors are printed in bold type, which we can't do in handwriting or when writing
on the blackboard. So you should write vectors using the notation you used in the
Geometry I course.)

Remember that we can represent a vector by a column consisting of three
numbers; for example,

$$ \mathbf{u} = 2\mathbf{i} + \mathbf{j} - 5\mathbf{k} = \begin{pmatrix} 2 \\ 1 \\ -5 \end{pmatrix}. $$

Addition The commutative and associative laws hold for vector addition; so
do the zero and inverse laws, if we take the vector

$$ \mathbf{0} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} $$

to be the zero element:

u + v = v + u,
(u + v) + w = u + (v + w),
0 + v = v,
v + (-v) = 0.



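These can all be proved by a calculation. For example, here is a proof of the
associative law. Let

$$ \mathbf{u} = \begin{pmatrix} a \\ b \\ c \end{pmatrix}, \quad \mathbf{v} = \begin{pmatrix} p \\ q \\ r \end{pmatrix}, \quad \mathbf{w} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}. $$

Then

$$ (\mathbf{u} + \mathbf{v}) + \mathbf{w} = \begin{pmatrix} (a + p) + x \\ (b + q) + y \\ (c + r) + z \end{pmatrix}, \qquad \mathbf{u} + (\mathbf{v} + \mathbf{w}) = \begin{pmatrix} a + (p + x) \\ b + (q + y) \\ c + (r + z) \end{pmatrix}, $$

and (a + p) + x = a + (p + x), etc., since the associative law holds for addition of
real numbers. So the two expressions are equal.

Notice what we have done here: we used the associative law for the real numbers
to prove it for 3-dimensional vectors.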



Scalar product Asking about the associative law or other laws for the scalar
product doesn't really make sense, since the scalar product of two vectors is a
number, not a vector! So (u · v) · w is meaningless.

The lesson is that the operations we will be studying must take two objects of 
some kind and combine them into another object of the same kind. 

Vector product Remember the formula for the vector product: 

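$$ (a\mathbf{i} + b\mathbf{j} + c\mathbf{k}) \times (x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) = \begin{vmatrix} \mathbf{i} & a & x \\ \mathbf{j} & b & y \\ \mathbf{k} & c & z \end{vmatrix}, $$

or, to put it another way,

(ai + bj + ck) × (xi + yj + zk) = (bz - cy)i + (cx - az)j + (ay - bx)k.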

(This was not the definition, but it was proved in Part 5 of the notes that this 
formula holds.) 

What properties does it have? You also met these properties in the Geometry I 
course. 

Associative law: This does not hold. Remember that v × v = 0 for any vector v.
Now

(i × i) × j = 0 × j = 0,
i × (i × j) = i × k = -j.

(Remember that to disprove something like the associative law, a single 
counterexample is enough!) 

Commutative law: This does not hold either. In fact, I hope you remember from 
Geometry I that 

u × v = -(v × u)

for any two vectors u and v. To get a specific counterexample, we could
observe that

i × j = k,    j × i = -k.

Distributive law: This one is true:

u × (v + w) = (u × v) + (u × w).

How do you prove this?

Identity law: This one also fails. There cannot be a vector e with the property
that e × v = v for any choice of v, because e × v is always perpendicular
to v!

The lesson here is that even nice operations might fail to satisfy the usual laws 
for numbers. 
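The counterexamples for the vector product are easy to reproduce. Here is a minimal Python sketch, with vectors stored as tuples of three numbers; the function name `cross` is my own choice, and the formula inside it is the component formula for the vector product.

```python
def cross(u, v):
    """Vector product of two 3-dimensional vectors given as tuples."""
    a, b, c = u
    x, y, z = v
    return (b * z - c * y, c * x - a * z, a * y - b * x)

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# The associative law fails:
print(cross(cross(i, i), j))     # (0, 0, 0)
print(cross(i, cross(i, j)))     # (0, -1, 0), that is, -j

# The commutative law fails:
print(cross(i, j), cross(j, i))  # (0, 0, 1) (0, 0, -1)
```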



3.2 Matrices 

Matrices form another class of objects which can be added and multiplied. We
will consider just 2 × 2 matrices, as these illustrate the general principles. Recall
the rules that you learned in Geometry I. Let

$$ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} e & f \\ g & h \end{pmatrix} $$

be two matrices. We will take the entries a, ..., h to be arbitrary real numbers.

Addition The sum of two matrices A and B is the matrix obtained by adding
corresponding elements of A and B:

$$ \begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} a + e & b + f \\ c + g & d + h \end{pmatrix}. $$

Multiplication The rule for multiplication is more complicated:

$$ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix}. $$

It works like this. To work out the entry in the first row and second column of the
product AB, we take the first row of A (which is (a b)), and the second column
of B (which is $\begin{pmatrix} f \\ h \end{pmatrix}$); multiply corresponding elements (a by f, and b by h), and
add the products, to get af + bh. The rule for the other entries in AB is similar.

Do these operations satisfy the laws we wrote down earlier?

Addition The commutative, associative, identity, and inverse laws all hold. 

To verify that A + B = B + A, we have to show that corresponding entries of
these matrices are equal. These entries are obtained by adding corresponding
entries in A and B in either order; the results are equal. In detail,

$$ \begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} a + e & b + f \\ c + g & d + h \end{pmatrix}, $$

$$ \begin{pmatrix} e & f \\ g & h \end{pmatrix} + \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} e + a & f + b \\ g + c & h + d \end{pmatrix}, $$

and the matrices on the right are equal because a + e = e + a etc.

The associative law is true, and the argument to prove it is similar. If we define
the zero matrix to be

$$ 0_{2 \times 2} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, $$

then we have 0_{2×2} + A = A for any matrix A; for, to work out 0_{2×2} + A, we add
zero to each entry of A, which doesn't change it. Similarly, for any matrix A,
we let -A be the matrix whose entries are the negatives of the entries of A; then
A + (-A) = 0_{2×2}.



Multiplication Here we find our first surprise: The commutative law for multi- 
plication fails! Remember that to disprove a general assertion, we only need one 
counterexample: 

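$$ \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} = \begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix} \neq \begin{pmatrix} 23 & 34 \\ 31 & 46 \end{pmatrix} = \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}. $$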

[How did I find this example? Trial and error; I wrote down the first two matrices 
I could think of, multiplied them both ways round, and found that the results were 
different.] 
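If you want to run your own trial-and-error search, the multiplication rule for 2 × 2 matrices is easy to code. In the Python sketch below a matrix is stored as a pair of rows; the name `matmul` and this representation are assumptions made for the illustration.

```python
def matmul(A, B):
    """Product of two 2x2 matrices, each given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

A = ((1, 2), (3, 4))
B = ((5, 6), (7, 8))

print(matmul(A, B))   # ((19, 22), (43, 50))
print(matmul(B, A))   # ((23, 34), (31, 46))
```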

Despite this, the associative law and the identity law do both hold for matrix 
multiplication. For the associative law, there is no alternative but to multiply it out 
and see: 

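$$ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \left( \begin{pmatrix} e & f \\ g & h \end{pmatrix} \begin{pmatrix} i & j \\ k & l \end{pmatrix} \right) = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} ei + fk & ej + fl \\ gi + hk & gj + hl \end{pmatrix} = \begin{pmatrix} a(ei + fk) + b(gi + hk) & \cdots \\ \cdots & \cdots \end{pmatrix}, $$

$$ \left( \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e & f \\ g & h \end{pmatrix} \right) \begin{pmatrix} i & j \\ k & l \end{pmatrix} = \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix} \begin{pmatrix} i & j \\ k & l \end{pmatrix} = \begin{pmatrix} (ae + bg)i + (af + bh)k & \cdots \\ \cdots & \cdots \end{pmatrix}. $$

Algebraic manipulation shows that

a(ei + fk) + b(gi + hk) = (ae + bg)i + (af + bh)k.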

[Take a look at this manipulation. We first expand the brackets on the left, using
the distributive law. This gives a(ei) + a(fk) + b(gi) + b(hk). Now use the associative
law for multiplication to switch this into (ae)i + (af)k + (bg)i + (bh)k. Then
the commutative law for addition changes this to (ae)i + (bg)i + (af)k + (bh)k,
and the distributive law once more turns this into the right-hand expression. Oh,
and I forgot to mention that I used the associative law for addition without telling
you, when I wrote down the sum of four terms without telling you where the
brackets go! So almost all the laws for real numbers get used.]

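To prove the identity law for multiplication, we have to know what the identity
matrix is. Since the zero matrix has every entry zero, you might guess that the
identity matrix has every entry 1, but it doesn't:

$$ \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 4 & 6 \\ 4 & 6 \end{pmatrix}. $$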
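In fact the identity matrix has ones on the main diagonal and zeros elsewhere:

$$ I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. $$

We have I_2 A = A for any 2 × 2 matrix A:

$$ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. $$

Now another possible problem might occur to you. Since multiplication is not
commutative, is it true that A I_2 = A for any A? Well, yes it is:

$$ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, $$

as you can check. [You may also notice that, as well as the identity law for
multiplication, we use the fact that 0a = 0 for any real number a and the zero law for
addition.]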

The inverse law for multiplication does not hold. For example, if A = 

then there is no matrix B such that AB = I_2. You learned in Geometry I that the
condition for a matrix to have an inverse is that its determinant is not zero.
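The determinant condition can be turned into a small program. The Python sketch below uses the standard formula for the inverse of a 2 × 2 matrix (swap the diagonal entries, negate the other two, and divide by the determinant ad - bc). The function name `inverse` and the two sample matrices are my own choices for this illustration, not examples taken from the course; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def inverse(A):
    """Inverse of a 2x2 matrix ((a, b), (c, d)), or None if its determinant is 0."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        return None          # no matrix B satisfies AB = I_2
    return ((Fraction(d, det), Fraction(-b, det)),
            (Fraction(-c, det), Fraction(a, det)))

print(inverse(((1, 2), (2, 4))))   # None, since the determinant is 1*4 - 2*2 = 0
print(inverse(((1, 2), (3, 4))))   # entries -2, 1, 3/2, -1/2 (shown as Fraction objects)
```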

Distributive law: I leave it to you to check that

A(B + C) = AB + AC

for any matrices A, B, C. You might even want to check which laws for real numbers
are used in the proof. Because multiplication is not commutative, we can also
check the other way round:

(B + C)A = BA + CA

for any matrices A, B, C.

We use the notation M_{2×2}(R) for the set of all 2 × 2 matrices with real numbers
as entries. (We call these matrices "real matrices" for short.) As you can see, we
can easily generalise this notation. By changing the subscript, we can talk about
the set of matrices of different size, say 3 × 3; and by putting Z, Q or C in place
of R, we can talk about matrices whose entries are integers, rational numbers, or
complex numbers.



3.3 Polynomials

You can think of a polynomial as a function which can be written as a sum of
terms, each of which is a power of x multiplied by a constant. So "the polynomial
x^2" should really be "the polynomial 1x^2". We write x^1 as x, and leave out x^0
altogether (just writing the constant). If the coefficient of a power of x is zero, we
usually don't bother writing it: so we write 2x^2 + 3 rather than 2x^2 + 0x + 3. Of
course, if all the terms are zero, we have to write something; so we just write 0.
So a typical polynomial has the form

a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0.

Note that a constant a_0 is a special kind of polynomial called a constant polynomial.

The degree of a polynomial is the largest number n such that the polynomial
contains a term a_n x^n with a_n ≠ 0. Thus, a non-zero constant polynomial has
degree 0, since it has the form a_0 x^0. The zero polynomial doesn't have a degree,
since it doesn't have any non-zero terms! [Be warned: some people say that it
has degree -1; others say that it has degree -∞. Of course, these are merely
conventions.]

Addition and multiplication You already know how to add and multiply polynomials.
But it is difficult to give a proper mathematical definition. For example,

(2x^2 + 3) + (x^3 + x - 5) = x^3 + 2x^2 + x - 2.

We can't just say "add corresponding terms", since some terms may be missing;
we have first to put the missing terms in with coefficients 0. For multiplication, we
multiply each term of the first factor by each term of the second, and then gather
up terms involving the same power of x:

(2x^2 + 3)(x^3 + x - 5) = 2x^5 + (2x^3 + 3x^3) - 10x^2 + 3x - 15
                        = 2x^5 + 5x^3 - 10x^2 + 3x - 15.

I ask you to take on trust for now that it is possible to give good definitions of
addition and multiplication of polynomials, and to show that they do satisfy the 
commutative, associative and identity laws for both addition and multiplication, 
the inverse law for addition, and the distributive law. 
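One way to make the definitions precise (this is a sketch under my own assumptions, not necessarily the definition used later in the course) is to represent a polynomial by its list of coefficients [a_0, a_1, a_2, ...], putting in zeros for the missing terms. Addition then adds corresponding coefficients, and multiplication collects the products of terms by the power of x they contribute to. The function names `poly_add` and `poly_mul` are hypothetical; the example uses the polynomials 2x^2 + 3 and x^3 + x - 5.

```python
from itertools import zip_longest

def poly_add(p, q):
    """Add polynomials given as coefficient lists [a0, a1, a2, ...]."""
    # zip_longest pads the shorter list with zeros for the missing terms.
    return [a + b for a, b in zip_longest(p, q, fillvalue=0)]

def poly_mul(p, q):
    """Multiply non-empty coefficient lists [a0, a1, a2, ...]."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b      # x^i times x^j contributes to x^(i+j)
    return result

p = [3, 0, 2]        # 2x^2 + 3
q = [-5, 1, 0, 1]    # x^3 + x - 5

print(poly_add(p, q))   # [-2, 1, 2, 1], that is, x^3 + 2x^2 + x - 2
print(poly_mul(p, q))   # [-15, 3, -10, 5, 0, 2], that is, 2x^5 + 5x^3 - 10x^2 + 3x - 15
```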

We use the notation R[x] for the set of all polynomials with real numbers as
coefficients. (We call them "real polynomials" for short.) As you can see, this
notation can be generalised: Q[x] and C[x] denote the sets of polynomials with
rational or complex numbers as coefficients. These sets satisfy the same rules for
addition and multiplication as the real polynomials.

3.4 Sets 

Here is another example where we have an operation or rule of combination for 
objects which are nothing like numbers. 

Let S be a set. We regard it as a "universal set"; in Probability I, it was called
the sample space. Our objects will be subsets of S.

Two operations which can be performed on sets are union and intersection, 
defined as follows: 

Union: the union of two sets A and B is the set of all elements lying in either A
or B:

A ∪ B = {x : x ∈ A or x ∈ B}.

We read A ∪ B as "A union B", or "A or B".

Intersection: the intersection of two sets A and B is the set of all elements lying
in both A and B:

A ∩ B = {x : x ∈ A and x ∈ B}.

We read A ∩ B as "A intersection B", or "A and B".

We can represent sets by Venn diagrams, and show these two operations in a
diagram as follows:

[Figure: Venn diagrams of two sets A and B, illustrating A ∪ B and A ∩ B]



Here are some laws they satisfy. 



Commutative laws: A ∪ B = B ∪ A, A ∩ B = B ∩ A

Associative laws: A ∪ (B ∪ C) = (A ∪ B) ∪ C, A ∩ (B ∩ C) = (A ∩ B) ∩ C

Identity laws: A ∪ ∅ = A, A ∩ S = A (where S is the universal set)

When we come to the distributive law, there is a small surprise. To write down
the distributive law for numbers, we have to distinguish between addition and
multiplication. It is true that

a × (b + c) = (a × b) + (a × c),

but it is not true that

a + (b × c) = (a + b) × (a + c).

For sets, which of our two operations should play the role of addition, and which
should be multiplication?

It turns out that it works both ways round. We can replace "plus" by "or" and
"times" by "and", or vice versa:

Distributive laws: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C), A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)

All of these assertions have similar proofs: draw a Venn diagram to convince
yourself, and then give a mathematical argument. Here is the proof of the first
distributive law. I leave the Venn diagram to you.

x ∈ A ∩ (B ∪ C) ⇔ x ∈ A and x ∈ B ∪ C
                ⇔ (x ∈ A and x ∈ B) or (x ∈ A and x ∈ C)
                ⇔ x ∈ (A ∩ B) ∪ (A ∩ C).

So the two sets A ∩ (B ∪ C) and (A ∩ B) ∪ (A ∩ C) have the same members, and
hence are equal.

The inverse laws are not true. For example, we saw that the zero element for
the operation of union is the empty set ∅; and, given a set A which is not the empty
set, it is impossible to find a set B such that A ∪ B = ∅, since A ∪ B is at least as
large as A. The failure of the inverse law for intersection is similar.

In Probability I, you saw several other operations on sets: difference, symmetric
difference, and complement. You might like to check which of our laws are
satisfied by difference, or by symmetric difference, for example.
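For a small universal set, the laws for union and intersection can be checked exhaustively by computer. The Python sketch below does this for both distributive laws, and also confirms that the inverse law for union fails; the three-element universal set and the helper name `subsets` are assumptions made purely for this illustration. In Python, `|` is union and `&` is intersection.

```python
from itertools import combinations

S = {1, 2, 3}   # a small universal set, chosen only for this illustration

def subsets(universe):
    """All subsets of `universe`, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

all_subsets = subsets(S)

# Both distributive laws hold for every choice of A, B, C.
print(all(A & (B | C) == (A & B) | (A & C) and
          A | (B & C) == (A | B) & (A | C)
          for A in all_subsets for B in all_subsets for C in all_subsets))   # True

# The inverse law for union fails: no B makes the union with a non-empty A empty.
A = frozenset({1})
print(any(A | B == frozenset() for B in all_subsets))   # False
```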



Exercises 



3.1 




(b) Find the inverse of

3.2 Find two matrices having entries 0 and 1 only which do not commute with
each other.

3.3 Show that the symmetric difference of sets satisfies the associative, commu- 
tative, identity and inverse laws, where the identity element is the empty set and 
the inverse of any set A is equal to A. 

3.4 Recall the definition of the quaternions from the last chapter. 

(a) Show that any quaternion can be formally written as a + v, where a ∈ R and
v is a 3-dimensional real vector.

(b) Show that

(a + v) + (b + w) = (a + b) + (v + w),
(a + v)(b + w) = (ab - v · w) + (aw + bv + v × w),

where · and × denote the dot and cross product of vectors.



Chapter 4 

Relations and functions 



4.1 Ordered pairs and Cartesian product 

We write {x, y} to mean a set containing just the two elements x and y. More
generally, {x_1, x_2, ..., x_n} is a set containing just the n elements x_1, x_2, ..., x_n.

The order in which elements come in a set is not important. So {y, x} is the
same set as {x, y}. This set is sometimes called an unordered pair.

Often, however, the order of the elements does matter, and we need a different
construction. We write the ordered pair with first element x and second element y
as (x, y); this is not the same as (y, x) unless x and y are equal. You have seen this
notation used for the coordinates of points in the plane. The point with coordinates
(2, 3) is not the same as the point with coordinates (3, 2). The rule for equality of
ordered pairs is:

(x, y) = (u, v) if and only if x = u and y = v.

This notation can be extended to ordered n-tuples for larger n. For example, a
point in three-dimensional space is given by an ordered triple (x, y, z) of coordinates.

The idea of coordinatising the plane or 
three-dimensional space by ordered pairs or triples 
of real numbers was invented by Descartes. In his 
honour, we call the system "Cartesian coordinates". 
This great idea of Descartes allows us to use 
algebraic methods to solve geometric problems, as 
you saw in the Geometry I course last term. 

By means of Cartesian coordinates, the set of all points in the plane is matched
up with the set of all ordered pairs (x, y), where x and y are real numbers. We call
this set R × R, or R^2. This notation works much more generally, as we now
explain.

Let X and Y be any two sets. We define their Cartesian product X × Y to be
the set of all ordered pairs (x, y), with x ∈ X and y ∈ Y; that is, all ordered pairs
which can be made using an element of X as first coordinate and an element of Y
as second coordinate. We write this as follows:

X × Y = {(x, y) : x ∈ X, y ∈ Y}.

You should read this formula exactly as in the explanation. The notation 

{x:P} 

means "the set of all elements x for which P holds". This is a very common way 
of specifying a set. 

If Y = X, we write X × Y more briefly as X^2. Similarly, if we have sets
X_1, ..., X_n, we let X_1 × ··· × X_n be the set of all ordered n-tuples (x_1, ..., x_n) such
that x_1 ∈ X_1, ..., x_n ∈ X_n. If X_1 = X_2 = ··· = X_n = X, say, we write this set as X^n.

If the sets are finite, we can do some counting. Remember that we use the
notation |X| for the number of elements of the set X (not to be confused with
the modulus |z| of the complex number z, for example).

Proposition 4.1 Let X and Y be sets with |X| = p and |Y| = q. Then

(a) |X × Y| = pq;

(b) |X^n| = p^n.

Proof (a) In how many ways can we choose an ordered pair (x, y) with x ∈ X and
y ∈ Y? There are p choices for x, and q choices for y; each choice of x can be
combined with each choice for y, so we multiply the numbers.

(b) This is an exercise for you.

The "multiplicative principle" used in part (a) of the above proof is very important.
For example, if X = {1, 2} and Y = {a, b, c}, then we can arrange the
elements of X × Y in a table with two rows and three columns as follows:

(1, a)  (1, b)  (1, c)
(2, a)  (2, b)  (2, c)
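The same table can be produced in Python, where `itertools.product` from the standard library builds Cartesian products. This is only an illustration of the counting in Proposition 4.1 for one small example (with the letters a, b, c stored as strings), not a proof.

```python
from itertools import product

X = [1, 2]
Y = ["a", "b", "c"]

pairs = list(product(X, Y))   # all ordered pairs (x, y) with x in X and y in Y
print(pairs)
# [(1, 'a'), (1, 'b'), (1, 'c'), (2, 'a'), (2, 'b'), (2, 'c')]

print(len(pairs) == len(X) * len(Y))                     # True: |X × Y| = pq
print(len(list(product(X, repeat=3))) == len(X) ** 3)    # True: |X^3| = p^3
```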
4.2 Relations

Suppose we are given a set of people P_1, ..., P_n. What does the relation of being
sisters mean? For each ordered pair (P_i, P_j), either P_i and P_j are sisters, or they are
not; so we can think of the relation as being a rule of some kind which answers
"true" or "false" for each pair (P_i, P_j). Mathematically, there is a more abstract
way of saying the same thing; the relation of sisterhood is the set of all ordered
pairs (P_i, P_j) for which the relation is true. (When I say that P_i and P_j are sisters, I
mean that each of them is the sister of the other.)

So we define a relation R on a set X to be a subset of the Cartesian product
X^2 = X × X; that is, a set of ordered pairs. We think of the relation as holding
between x and y if the pair (x, y) is in R, and not holding otherwise.

Here is another example. Let X = {1,2,3,4}, and let R be the relation "less 
than" (this means, the relation that holds between x and y if and only if x < y). 
Then we can write R as a set by listing all the pairs for which this is true:

R = {(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)}.

How many different relations are there on the set X = {1, 2, 3, 4}? A relation
on X is a subset of X × X. There are 4 × 4 = 16 elements in X × X, by Proposition 4.1.
How many subsets does a set of size 16 have? For each element of the
set, we can decide to include that element in the subset, or to leave it out. The two
choices can be made independently for each of the sixteen elements of X^2, so the
number of subsets is

2 × 2 × ··· × 2 = 2^16 = 65536.

So there are 65536 relations. Of course, not all of them have simple names like 
"less than". 

You will see that a relation like "less than" is written x < y; in other words,
we put the symbol for the relation between the names of the two elements making
up the ordered pair. We could, if we wanted, invent a similar notation for any
relation. Thus, if R is a relation, we could write xRy to mean (x, y) ∈ R.
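
To make the idea of "a relation as a set of pairs" concrete, here is a small sketch in Python (an illustration only; the names X, less_than and num_relations are our own choices).

```python
X = [1, 2, 3, 4]

# The relation "less than" on X, written as a set of ordered pairs.
less_than = {(x, y) for x in X for y in X if x < y}
print(sorted(less_than))

# Every subset of X x X is a relation, so there are 2^(|X|^2) relations on X.
num_relations = 2 ** (len(X) ** 2)
print(num_relations)
```

Asking whether x R y holds is then just a membership test such as `(2, 3) in less_than`.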

4.3 Equivalence relations and partitions 

Just as there are certain laws that operations like multiplication may or may not 
satisfy, so there are laws that relations may or may not satisfy. Here are some 
important ones. 

Let R be a relation on a set X. We say that R is

reflexive if (x, x) ∈ R for all x ∈ X;

symmetric if (x, y) ∈ R implies that (y, x) ∈ R;

transitive if (x, y) ∈ R and (y, z) ∈ R together imply that (x, z) ∈ R.

For example, the relation "less than" is not reflexive (since no element is less
than itself); is not symmetric (since x < y and y < x cannot both hold); but is
transitive (since x < y and y < z do imply that x < z). The relation of being sisters
is not reflexive (it is debatable whether a girl can be her own sister, but a boy
certainly cannot!), but it is symmetric. It is "almost" transitive: if x and y are
sisters, and y and z are sisters, then x and z are sisters except in the case when
x = z. But this case can actually occur, so the relation is not transitive. (For it to
be transitive, the transitive law would have to hold without any exceptions.)

A very important class of relations are called equivalence relations. An equiv- 
alence relation is a relation which is reflexive, symmetric, and transitive. 
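
For a relation on a finite set, the three laws can be checked by brute force. The sketch below is in Python and is only an illustration; the function names and the example relations same_parity and not_equal are our own (not_equal behaves like the "sisters" relation: symmetric, but neither reflexive nor transitive).

```python
def is_reflexive(X, R):
    # (x, x) must be in R for every x in X.
    return all((x, x) in R for x in X)

def is_symmetric(R):
    # Whenever (x, y) is in R, so is (y, x).
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    # Whenever (x, y) and (y, z) are in R, so is (x, z).
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)

def is_equivalence(X, R):
    return is_reflexive(X, R) and is_symmetric(R) and is_transitive(R)

X = {1, 2, 3, 4}
less_than = {(x, y) for x in X for y in X if x < y}
not_equal = {(x, y) for x in X for y in X if x != y}
same_parity = {(x, y) for x in X for y in X if (x - y) % 2 == 0}

print(is_reflexive(X, less_than), is_symmetric(less_than), is_transitive(less_than))
print(is_reflexive(X, not_equal), is_symmetric(not_equal), is_transitive(not_equal))
print(is_equivalence(X, same_parity))
```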

Before seeing the job that equivalence relations do in mathematics, we need 
another definition. 

Let X be a set. A partition of X is a collection {A_1, A_2, ...} of subsets of X
having the following properties:

(a) A_i ≠ ∅ for all i;

(b) A_i ∩ A_j = ∅ for i ≠ j;

(c) A_1 ∪ A_2 ∪ ··· = X.

So each set is non-empty; no two sets have any element in common; and between
them they cover the whole of X. The name arises because the set X is divided into
disjoint parts A_1, A_2, ....





[Figure: the set X divided into disjoint parts A_1, ..., A_5.]



The statement and proof of the next theorem are quite long, but the message 
is very simple: the job of an equivalence relation on X is to produce a partition of 
X; every equivalence relation gives a partition, and every partition comes from an 
equivalence relation. This result is called the Equivalence Relation Theorem. 

First we need one piece of notation. Let R be a relation on a set X. We write
R(x) for the set of elements of X which are related to x; that is,

R(x) = {y ∈ X : (x, y) ∈ R}.

Theorem 4.2 (a) Let R be an equivalence relation on X. Then the sets R(x),
for x ∈ X, form a partition of X.

(b) Conversely, given any partition {A_1, A_2, ...} of X, there is an equivalence
relation R on X such that the sets A_i are the same as the sets R(x) for x ∈ X.

Proof (a) We have to show that the sets R(x) satisfy the conditions in the defini-
tion of a partition of X.

• For any x, we have (x, x) ∈ R (since R is reflexive), so x ∈ R(x); thus R(x) ≠
∅.

• We have to show that, if R(x) ≠ R(y), then R(x) ∩ R(y) = ∅. The contrapos-
itive of this is: if R(x) ∩ R(y) ≠ ∅, then R(x) = R(y); we prove this. Suppose
that R(x) ∩ R(y) ≠ ∅; this means that there is some element, say z, lying in
both R(x) and R(y). By definition, (x, z) ∈ R and (y, z) ∈ R; hence (z, y) ∈ R
by symmetry and (x, y) ∈ R by transitivity.

We have to show that R(x) = R(y); this means showing that every element
in R(x) is in R(y), and every element of R(y) is in R(x). For the first claim,
take u ∈ R(x). Then (x, u) ∈ R. Also (y, x) ∈ R (by symmetry, since we know that
(x, y) ∈ R); so (y, u) ∈ R by transitivity, and u ∈ R(y). Conversely, if u ∈ R(y),
a similar argument (which you should try for yourself) shows that u ∈ R(x).
So R(x) = R(y), as required.

• Finally we have to show that the union of all the sets R(x) is X, in other
words, that every element of X lies in one of these sets. But we already
showed in the first part that x belongs to the set R(x).

(b) Suppose that {A_1, A_2, ...} is a partition of X. We define a relation R as
follows:

R = {(x, y) : x and y lie in the same part of the partition}.

Now

• x and x lie in the same part of the partition, so R is reflexive.

• If x and y lie in the same part of the partition, then so do y and x; so R is
symmetric.

• Suppose that x and y lie in the same part A_i of the partition, and y and z lie
in the same part A_j. Then y ∈ A_i and y ∈ A_j; and so we have A_i = A_j (since
different parts are disjoint). Thus x and z both lie in A_i. So R is transitive.

Thus R is an equivalence relation. But clearly R(x) consists of all elements lying
in the same part of the partition as x; so, if x ∈ A_i, then R(x) = A_i. So the partition
consists of the sets R(x).

If R is an equivalence relation, then the sets R(x) (the parts of the partition
corresponding to R) are called the equivalence classes of R.
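
Both directions of the Equivalence Relation Theorem can be tried out on a small set. The following Python sketch is an illustration only; the function names equivalence_classes and relation_from_partition are hypothetical names of our own.

```python
def equivalence_classes(X, R):
    """Return the distinct sets R(x) = {y in X : (x, y) in R}, for x in X."""
    return {frozenset(y for y in X if (x, y) in R) for x in X}

def relation_from_partition(parts):
    """Return {(x, y) : x and y lie in the same part of the partition}."""
    return {(x, y) for part in parts for x in part for y in part}

X = {1, 2, 3}
partition = [{1}, {2, 3}]

# From the partition to the relation ...
R = relation_from_partition(partition)
print(sorted(R))

# ... and from the relation back to the partition.
classes = equivalence_classes(X, R)
print(sorted(sorted(c) for c in classes))
```

Starting from the partition {{1}, {2, 3}}, the relation has five pairs, and its equivalence classes are the two parts we started with.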

Here is an example. There are five partitions of the set {1,2,3}. One has a 
single part; three of them have one part of size 1 and one of size 2; and one has 
three parts of size 1. Here are the partitions and the corresponding equivalence 
relations. 



Partition          Equivalence relation

{{1,2,3}}          {(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)}
{{1},{2,3}}        {(1,1),(2,2),(2,3),(3,2),(3,3)}
{{2},{1,3}}        {(1,1),(1,3),(2,2),(3,1),(3,3)}
{{3},{1,2}}        {(1,1),(1,2),(2,1),(2,2),(3,3)}
{{1},{2},{3}}      {(1,1),(2,2),(3,3)}



Since partitions and equivalence relations amount to the same thing, we can 
use whichever is more convenient. 



4.4 Functions 

What is a function? This is a question that has given mathematicians a lot of 
trouble over the ages. People used to think that a function had to be given by a 
formula, such as x^2 or sin x. We don't require this any longer. All that is important
is that you put in a value for the argument of the function, and out comes a value. 
Think of a function as a kind of black box: 




The name of the function is F; we put x into the black box and F(x) comes
out. Be careful not to confuse F, the name written on the black box, with F(x),
which is what comes out when x is put in. Sometimes the language makes it hard
to keep this straight. For example, there is a function which, when you put in x,
outputs x^2. We tend to call this "the function x^2", but it is really "the squaring
function", or "the function x ↦ x^2". (You see that we have a special symbol ↦ to
denote what the black box does.)

Black boxes are not really mathematical notation, so we re-formulate this defi- 
nition in more mathematical terms. We have to define what we mean by a function 
F. Now there will be a set X of allowable inputs to the black box; X is called the 
domain of F. Similarly, there will be a set Y which contains all the possible out- 
puts; this is called the codomain of F. (We don't necessarily require that every 
value of Y can come out of the black box. For the squaring function, the domain 
and the codomain are both equal to ℝ, even though none of the outputs can be
negative.) 

The important thing is that every input x ∈ X produces exactly one output
y = F(x) ∈ Y. The ordered pair (x, y) is a convenient way of saying that the input
x produces the output y. Then we can take all the possible ordered pairs as a
description of the function. Thus we come to the formal definition:

Let X and Y be sets. Then a function from X to Y is a subset F of
X × Y having the property that, for every x ∈ X, there is exactly one
element y ∈ Y such that (x, y) ∈ F. We write this unique y as F(x). We
write F : X → Y (read "F from X to Y") to mean that F is a function
with domain X and codomain Y.

The set of all elements F(x), as x runs through X, is called the range of the
function F. It is a subset of the codomain, but (as we remarked) it need not be the 
whole codomain. 

Here is an example. Let X = Y = {1, 2, 3, 4, 5}, and let

F = {(1,1), (2,4), (3,5), (4,4), (5,1)}.

Then F is a function from X to Y, with F(1) = 1, F(2) = 4, and so on. (In this
particular case, it happens that F is given by a fairly simple formula: F(x) =
6x − x^2 − 4.)

A function F : X → Y is called

injective, or one-to-one, if different elements of X have different images
under F: x_1 ≠ x_2 implies F(x_1) ≠ F(x_2) (or equivalently, F(x_1) = F(x_2)
implies x_1 = x_2).

surjective, or onto, if its range is equal to Y: that is, for every y ∈ Y, there is
some x ∈ X such that F(x) = y.

bijective, or a one-to-one correspondence, if it is both injective and surjec-
tive.
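
For finite sets, these definitions can be tested directly on the set of ordered pairs. Here is a sketch in Python using the example function F above; it is only an illustration, and the helper names (is_function, image, is_injective, is_surjective) are our own.

```python
X = {1, 2, 3, 4, 5}
Y = {1, 2, 3, 4, 5}
F = {(1, 1), (2, 4), (3, 5), (4, 4), (5, 1)}

def is_function(F, X, Y):
    # F must be a subset of X x Y in which every x in X occurs exactly once
    # as a first coordinate.
    inside = all(x in X and y in Y for (x, y) in F)
    once = all(sum(1 for (a, _) in F if a == x) == 1 for x in X)
    return inside and once

def image(F):
    # The range of F: all the outputs that actually occur.
    return {y for (_, y) in F}

def is_injective(F):
    # Assuming F is a function: no output is produced by two different inputs.
    return len(image(F)) == len(F)

def is_surjective(F, Y):
    return image(F) == Y

print(is_function(F, X, Y), image(F), is_injective(F), is_surjective(F, Y))
```

Here the range is {1, 4, 5}, so this particular F is neither injective nor surjective.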

A bijective function from X to Y matches up the two sets: for each x ∈ X there
is a unique y = F(x) ∈ Y; and for each y ∈ Y there is a unique x ∈ X such that
F(x) = y. This can only happen if X and Y have the same number of elements.

If F is a bijective function from X to Y, then there is an inverse function G from
Y to X which takes every element y ∈ Y to the unique x ∈ X for which F(x) = y.
In other words, the black box for G is the black box for F in reverse:

x = G(y) if and only if y = F(x).

The inverse function G is also bijective. Thus a bijective function F and its inverse
G satisfy

• G(F(x)) = x for all x ∈ X;

• F(G(y)) = y for all y ∈ Y.

Notice that F is the inverse function of G.

Sometimes we represent a function F : A → B by a picture, where we show
the two sets A and B, and draw an arrow from each element a of A to the element
b = F(a) of B. For such a picture to show a function, each element of A must have
exactly one arrow leaving it. Now

• F is one-to-one (injective) if no point of B has two or more arrows entering
it;

• F is onto (surjective) if every point of B has at least one arrow entering it;

• F is one-to-one and onto (bijective) if every point of B has exactly one arrow
entering it; in this case, the arrows match up the points of A with the points
of B.

Here are some illustrations. The first is not a function because some elements of 
A have more than one arrow leaving them while some have none. 

4.5 Operations

An operation is a special kind of function: its domain is X x X and its codomain 
is X, where X is a set. In other words, the input to the black box for F consists of
a pair (x, y) of elements of X, and the output is a single element of X. So we can
think of the function as "combining" the two inputs in some way.

There is a different notation often used for operations. Rather than write the
function as F, so that z = F(x, y) is the output when x and y are input, instead we
choose a symbol like +, ×, ∗, ÷, ∘ or •, and place it between the two inputs: that
is, we write x + y, or x × y, or ..., instead of F(x, y). This is called infix notation.

Many of the operations we have already met (addition, subtraction, multipli- 
cation for numbers; addition and vector product for vectors; addition and multi- 
plication for matrices or polynomials; union and intersection for sets) are binary 
operations. 

An operation on a finite set can be represented by an operation table. This is
a square table with elements of the set X labelling the rows and columns of the
table. To calculate x ∘ y (if ∘ is the operation), we look in the row labelled x and
column labelled y; the element in the table in this position is x ∘ y. Here is an
example:



∘   a   b   c

a   a   b   c
b   b   b   c
c   c   c   c



Given an operation, we can ask whether it satisfies the laws of algebra that we 
have met several times already. Consider the above example. 

Commutative? Yes, since the table is symmetric about the main diagonal, so x ∘ y
is always the same as y ∘ x.

Associative? Yes, though this is harder to show. You are invited either to prove it
by considering all cases of the associative law, or to find a nicer proof using
a description of what the operation actually does.

Identity? Yes, a is an identity, since

a ∘ a = a, a ∘ b = b, a ∘ c = c

(and the operation is commutative).

Inverse? No, there is no element x such that c ∘ x = a, since c ∘ x is always equal
to c, whatever x is.
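
Checking "all cases" of a law is tedious by hand but easy for a machine. The sketch below (Python, offered only as an illustration; the dictionary representation of the table and the name op are our own choices) tests the four laws for the operation table above.

```python
from itertools import product

S = "abc"

# The operation table: table[(x, y)] is the entry in row x and column y.
table = {
    ("a", "a"): "a", ("a", "b"): "b", ("a", "c"): "c",
    ("b", "a"): "b", ("b", "b"): "b", ("b", "c"): "c",
    ("c", "a"): "c", ("c", "b"): "c", ("c", "c"): "c",
}

def op(x, y):
    return table[(x, y)]

commutative = all(op(x, y) == op(y, x) for x, y in product(S, repeat=2))
associative = all(op(op(x, y), z) == op(x, op(y, z))
                  for x, y, z in product(S, repeat=3))
identities = [e for e in S if all(op(e, x) == x and op(x, e) == x for x in S)]
c_has_inverse = any(op("c", x) == "a" and op(x, "c") == "a" for x in S)

print(commutative, associative, identities, c_has_inverse)
```

The associativity test runs through all 27 triples (x, y, z); it does not replace the "nicer proof" asked for above, but it confirms the answer.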

4.6 Appendix: Relations and functions 

In this section we will see that, given an arbitrary function, we can turn it into 
a bijective function. If F : X → Y is not onto, we can throw away the points of
the codomain Y which are not in the range of F. Making it one-to-one is more 
difficult. The theorem below shows how to do it. 

Theorem 4.3 Let F : X → Y be a function.

(a) The range of F, the set {y ∈ Y : y = F(x) for some x ∈ X}, is a subset B of Y.

(b) Define a relation R on X by the rule that (x_1, x_2) ∈ R if and only if F(x_1) =
F(x_2). Then R is an equivalence relation on X.

(c) Let A be the set of equivalence classes of R. Then the function F̄ : A → B
defined by F̄(R(x)) = F(x) for all x ∈ X, is a bijective function from A to B.

Proof Part (a) is clear. Part (b) is quite easy: R is

reflexive because F(x) = F(x) for all x ∈ X;

symmetric because F(x_1) = F(x_2) implies F(x_2) = F(x_1);

transitive because F(x_1) = F(x_2) and F(x_2) = F(x_3) imply F(x_1) = F(x_3).

Look at part (c). There is one important thing we have to do before we even
have a function F̄: to show that it is well-defined. How could this go wrong?
If x_1 and x_2 are equivalent (that is, if (x_1, x_2) ∈ R), then R(x_1) = R(x_2). What
guarantee do we have that F̄(R(x_1)) = F̄(R(x_2)), as we need? This means that
F(x_1) = F(x_2); but that is exactly the condition that ensures (x_1, x_2) ∈ R. So F̄ is
a well-defined function.

Is it one-to-one? Suppose that F̄(R(x_1)) = F̄(R(x_2)). Then by definition,
F(x_1) = F(x_2); so (x_1, x_2) ∈ R, so R(x_1) = R(x_2).

Is it onto? Take any y ∈ B. Since B is the range of F, there exists some x ∈ X
with F(x) = y. Then F̄(R(x)) = y, so y is in the range of F̄.

If that seems complicated, here is a picture.

[Figure: the function F from X to Y. The set X on the left is divided into five
slabs A_1, ..., A_5; each slab is mapped to one of the points y_1, ..., y_5 of Y on
the right.]



The five slabs on the left are the equivalence classes of the relation R; each
point in the top slab is mapped by F to the same point F(x) on the right. The
five points in the oval on the right make up the range of F. It is clear that equiva-
lence classes on the left are matched up with points of the range on the right by a
bijective function.

In our earlier example (the function F on X = Y = {1, 2, 3, 4, 5}), the equivalence
classes of the relation R are {1,5}, {2,4} and {3}; the range of F is {1,4,5}; and the
one-to-one correspondence of part (c) maps {1,5} to 1, {2,4} to 4, and {3} to 5.
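
The construction in Theorem 4.3 can be carried out mechanically for this example. The following Python sketch is an illustration only: the function F is stored as a dictionary, and F_bar is our own name for the induced map from equivalence classes to the range.

```python
# The earlier example F : {1,...,5} -> {1,...,5}, stored as input -> output.
F = {1: 1, 2: 4, 3: 5, 4: 4, 5: 1}

# Two inputs are equivalent when F sends them to the same output, so we can
# collect the equivalence classes by grouping inputs according to their output.
classes_by_output = {}
for x, y in F.items():
    classes_by_output.setdefault(y, set()).add(x)

# The induced map sends each equivalence class to the common value of F on it.
F_bar = {frozenset(cls): y for y, cls in classes_by_output.items()}

for cls, y in sorted(F_bar.items(), key=lambda item: item[1]):
    print(sorted(cls), "->", y)
```

Distinct classes receive distinct values, and every point of the range is hit, which is what the theorem asserts.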

[Figure: the function F from X to Y, and the corresponding bijection from A to B.]

Exercises

4.1 Which of the following relations R on sets X are (i) reflexive, (ii) symmetric, 
and (iii) transitive? For any relation which is an equivalence relation (that is, 
satisfies all three conditions), describe its equivalence classes. 






(a) X is the set of positive integers, R = {(x, y) : x + y = 100}.

(b) X is the set of integers, R = {(x, y) : x = y}.

(c) X is the set of railway stations in Great Britain, R is the set of pairs (x, y) of
stations for which there is a scheduled direct train from x to y.

4.2 For each of the following functions F, describe the image of F, and state 
whether F is (i) one-to-one and (ii) onto: 

(a) F : {0,1,2,3,4,5} → {0,1,2,3,4,5}, F(x) = ⌊x/2⌋ (the greatest integer not
exceeding x/2).

(b) F : ℝ → ℝ, F(x) = e^x.

(c) F ■.R^R,F{x)=x'^+x. 

4.3 How many operations are there on the set {a,b} with two elements? How 
many of them satisfy (i) the associative law, (ii) the identity law? 

4.4 The Fundamental Theorem of Algebra says that a polynomial of degree n 
over the complex numbers has n complex roots. 

Define a "function" F : ℂ × ℂ → ℂ × ℂ by the rule that F(a, b) = (c, d) if c and d are
the roots of the quadratic equation x^2 + ax + b = 0. (So, for example, F(−3, 2) =
(1, 2).)

Show that F is not in fact a function. 

Can you suggest a way to fix the definition? 



Chapter 5 

Division and Euclid's algorithm 



5.1 The division rule 

The division rule is the following property of natural numbers: 

Proposition 5.1 Let a and b be natural numbers, and assume that b > 0. Then 
there exist natural numbers q and r such that 

(a) a = bq + r; 

(b) 0 ≤ r ≤ b − 1.

Moreover, q and r are unique. 

The numbers q and r are the quotient and remainder when a is divided by b. 
The last part of the proposition (about uniqueness) means that, if q' and r' are
another pair of natural numbers satisfying a = bq' + r' and 0 ≤ r' ≤ b − 1, then
q = q' and r = r'.

Proof We will show the uniqueness first. Let q' and r' be as above. If r = r', then
bq = bq', so q = q' (as b > 0). So suppose that r ≠ r'. We may suppose that r < r'
(the case when r > r' is handled similarly). Then r' − r = b(q − q'). This number
is both a multiple of b, and also in the range from 1 to b − 1 (since both r and r'
are in the range from 0 to b − 1 and they are unequal). This is not possible.

It remains to show that q and r exist. Consider the multiples of b: 0, b, 2b, ....
Eventually these become greater than a. (Certainly (a + 1)b is greater than a.) Let
qb be the last multiple of b which is not greater than a. Then qb ≤ a < (q + 1)b.
So 0 ≤ a − qb < b. Putting r = a − qb gives the result.

Since q and r are uniquely determined by a and b, we write them as a div b and
a mod b respectively. So, for example, 37 div 5 = 7 and 37 mod 5 = 2.
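
The existence part of the proof is itself a (slow) recipe for finding q and r. Here is a sketch of it in Python, as an illustration only; the function name divide is our own. Python's built-in divmod returns the same pair for natural numbers.

```python
def divide(a, b):
    """Return (q, r) with a == b*q + r and 0 <= r <= b - 1,
    for natural numbers a and b with b > 0."""
    if a < 0 or b <= 0:
        raise ValueError("need a >= 0 and b > 0")
    q = 0
    # Step through the multiples of b until the next one would exceed a;
    # then q*b is the last multiple of b which is not greater than a.
    while (q + 1) * b <= a:
        q += 1
    return q, a - q * b

print(divide(37, 5))   # a div b and a mod b for a = 37, b = 5
print(divmod(37, 5))   # the built-in gives the same pair
```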

The division rule is sometimes called the division algorithm. Most people
understand the word "algorithm" to mean something like "computer program", 
but it really means a set of instructions which can be followed without any special 
knowledge or creativity and are guaranteed to lead to the result. A recipe is an 
algorithm for producing a meal. If I follow the recipe, I am sure to produce the 
meal. (But if I change things, for example by putting in too much chili powder, 
there is no guarantee about the result!) If I follow the recipe, and invite you to 
come and share the meal, I have to give you directions, which are an algorithm for 
getting from your house to mine. 

You learned in primary school an algorithm for long division which has been 
known and used for more than 3000 years. This algorithm is a set of instructions 
which, given two positive integers a and b, divides a by b and finds the quotient q
and remainder r satisfying a = bq + r and 0 ≤ r ≤ b − 1.

5.2 Greatest common divisor and least common multiple

We write a | b to mean that a divides b, or b is a multiple of a. Warning: Don't
confuse a | b with a/b, which means a divided by b; this is the opposite way
round! So | is a relation on the natural numbers: a | b holds if b = ac for some
natural number c.

Every natural number, including zero, divides 0. (This might seem odd, since
we know that "you can't divide by zero"; but 0 | 0 means simply that there exists
a number c such that 0 = 0 · c, which is certainly true.) On the other hand, zero
doesn't divide any natural number except zero.

Let a and b be natural numbers. A common divisor of a and b is a number d
with the property that d | a and d | b. We call d the greatest common divisor if it is
a common divisor, and if any other common divisor of a and b is smaller than d.
Thus, the common divisors of 12 and 18 are 1, 2, 3 and 6; and the greatest of these 
is 6. We write gcd(12, 18) = 6. We write gcd as shorthand for "greatest common 
divisor". 

The remarks above about zero show that gcd(a, 0) = a holds for any non-zero
number a. What about gcd(0,0)? Since every natural number divides zero, there 
is no greatest one. 

The number m is a common multiple of a and b if both a | m and b | m. It
is the least common multiple if it is a common multiple which is smaller than
any other common multiple. Thus the least common multiple of 12 and 18 is 36
(written lcm(12, 18) = 36). Any two natural numbers a and b have a least common
multiple. For there certainly exist common multiples, for example ab; and any
non-empty set of natural numbers has a least element. (The least common multiple
of 0 and a is 0, for any a.) We write lcm as shorthand for "least common multiple".

Is it true that any two natural numbers have a greatest common divisor? We
will see later that it is. Consider, for example, 8633 and 9167. Finding the gcd
looks like a difficult job. But, if you know that 8633 = 89 × 97 and 9167 = 89 ×
103, and that all the factors are prime, you can easily see that gcd(8633, 9167) =
89.

But this is not an efficient way to find the gcd of two numbers. Factorising a
number into its prime factors is notoriously difficult. In fact, it is the difficulty of
this problem which keeps internet commercial transactions secure!

Euclid discovered an efficient way to find the gcd of two numbers a long time
ago. His method gives us much more information about the gcd as well. In the
next section, we look at his method.

5.3 Euclid's algorithm

Euclid's algorithm is based on the following fact: for natural numbers a and b,
we have gcd(a, b) = a if b = 0, and gcd(a, b) = gcd(b, a mod b) if b > 0.

Proof We saw already that gcd(a, 0) = a, so suppose that b > 0. Let r = a mod b =
a − bq, where q = a div b, so that a = bq + r. If d divides a and b then it divides
a − bq = r, and if d divides b and r then it divides bq + r = a. So the lists of common
divisors of a and b, and common divisors of b and r, are the same, and the greatest
elements of these lists are also the same.

This is so slick that it doesn't tell us much. But looking more closely we see
that it gives us an algorithm for calculating the gcd of a and b. If b = 0, the answer
is a. If b > 0, calculate a mod b = b_1; our task is reduced to finding gcd(b, b_1), and
b_1 < b. Now repeat the procedure: if b_1 = 0, the answer is b; otherwise calculate
b_2 = b mod b_1, and our task is reduced to finding gcd(b_1, b_2), and b_2 < b_1. At each
step, the second number of the pair whose gcd we have to find gets smaller; so the
process cannot continue for ever, and must stop at some point. It stops when we
are finding gcd(b_{n−1}, b_n), with b_n = 0; the answer is b_{n−1}.

This is Euclid's Algorithm. Here it is more formally:

To find gcd(a, b):

Put b_0 = a and b_1 = b.

As long as the last number b_n found is non-zero, put b_{n+1} = b_{n−1} mod
b_n.

When the last number b_n is zero, then the gcd is b_{n−1}.

Example Find gcd(198, 78).
b_0 = 198, b_1 = 78.
198 = 2 · 78 + 42, so b_2 = 42.
78 = 1 · 42 + 36, so b_3 = 36.
42 = 1 · 36 + 6, so b_4 = 6.
36 = 6 · 6 + 0, so b_5 = 0.
So gcd(198, 78) = 6.
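
The algorithm translates almost word for word into a program. Here is a sketch in Python (an illustration only; the function names euclid_sequence and gcd are our own), which also records the sequence b_0, b_1, ... so that it can be compared with the example.

```python
def euclid_sequence(a, b):
    """Return the list [b_0, b_1, ..., b_n] produced by Euclid's algorithm,
    where b_0 = a, b_1 = b and the last entry b_n is zero."""
    seq = [a, b]
    # As long as the last number found is non-zero, append b_{n-1} mod b_n.
    while seq[-1] != 0:
        seq.append(seq[-2] % seq[-1])
    return seq

def gcd(a, b):
    """The gcd is b_{n-1}, the entry just before the final zero."""
    return euclid_sequence(a, b)[-2]

print(euclid_sequence(198, 78))
print(gcd(198, 78))
```

For a = 198 and b = 78 the sequence printed is the one found by hand in the example.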

Exercise Use Euclid's algorithm to find gcd(8633,9167). 

5.4 Euclid's algorithm extended 

The calculations that allow us to find the greatest common divisor of two numbers 
also do more. 

Theorem 5.3 Let a and b be natural numbers, and d = gcd(a, b). Then there are 
integers x and y such that d = xa + yb. Moreover, x and y can be found from 
Euclid's algorithm. 

Proof The first, easy, case is when b = 0. Then gcd(a, 0) = a = 1 · a + 0 · 0, so
we can take x = 1 and y = 0.

Now suppose that b > 0 and r = a mod b, so that a = bq + r. We saw that gcd(a, b) =
gcd(b, r) = d, say. Suppose that we can write d = ub + vr. Then we have

d = ub + v(a − qb) = va + (u − qv)b,

so d = xa + yb with x = v, y = u − qv.

Now, having run Euclid's algorithm, we can work back from the bottom to the
top expressing d as a combination of b_i and b_{i+1} for all i, finally reaching i = 0.

To make this clear, look back at the example. We have

42 = 1 · 36 + 6,       6 = 1 · 42 − 1 · 36
78 = 1 · 42 + 36,      6 = 1 · 42 − 1 · (78 − 42) = 2 · 42 − 1 · 78
198 = 2 · 78 + 42,     6 = 2 · (198 − 2 · 78) − 1 · 78 = 2 · 198 − 5 · 78.

The final expression is 6 = 2 · 198 − 5 · 78.
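
The proof is recursive, and so is the following sketch in Python (offered as an illustration; the name extended_gcd is our own). It follows the two cases of the proof: b = 0, and the step from a combination of b and r to a combination of a and b.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) and d == x*a + y*b."""
    if b == 0:
        # gcd(a, 0) = a = 1*a + 0*0.
        return a, 1, 0
    q, r = divmod(a, b)            # a = b*q + r
    d, u, v = extended_gcd(b, r)   # d = u*b + v*r
    # Substituting r = a - q*b gives d = v*a + (u - q*v)*b.
    return d, v, u - q * v

d, x, y = extended_gcd(198, 78)
print(d, x, y)
assert d == x * 198 + y * 78
```

For 198 and 78 this reproduces the expression found by working back through the example.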

We defined the greatest common divisor of a and b to be the largest natural
number which divides both. Using the result of Euclid's extended algorithm, we 
can say a bit more: 

Proposition 5.4 The greatest common divisor of the natural numbers a and b is 
the natural number d with the properties 

(a) d | a and d | b;

(b) if e is a natural number satisfying e | a and e | b, then e | d.

Proof Let d = gcd(a, b). Certainly condition (a) holds. Now suppose that e is
a natural number satisfying e | a and e | b. Euclid's algorithm gives us integers x
and y such that d = xa + yb. Now e | xa and e | yb; so e | xa + yb = d.

Remark Recall that, with our earlier definition, we had to admit that gcd(0, 0)
doesn't exist, since every natural number divides 0 and there is no greatest one.
But, with a = b = 0, there is a unique natural number satisfying the conclusion of
Proposition 5.4, namely d = 0. So in fact this Proposition gives us a better way to
define the greatest common divisor, which works for all pairs of natural numbers
without exception!

5.5 Polynomials 

Now we leave integers for a while, and turn to the set R[x] of all polynomials with 
real coefficients. 

There is also a version of the division rule and Euclid's algorithm for polyno- 
mials. The long division method for polynomials is similar to that for integers. 
Here is an example: Divide x^4 + 4x^3 − x + 5 by x^2 + 2x − 1.

                        x^2  + 2x    − 3
x^2 + 2x − 1 )  x^4 + 4x^3           − x   + 5
                x^4 + 2x^3 −  x^2
                      2x^3 +  x^2   − x
                      2x^3 + 4x^2   − 2x
                            −3x^2   + x   + 5
                            −3x^2   − 6x  + 3
                                      7x  + 2

This calculation shows that when we divide x^4 + 4x^3 − x + 5 by x^2 + 2x − 1, the
quotient is x^2 + 2x − 3 and the remainder is 7x + 2.

In general, let f(x) and g(x) be two polynomials, with g(x) ≠ 0. Then the
division rule produces a quotient q(x) and a remainder r(x) such that

• f(x) = g(x)q(x) + r(x);

• either r(x) = 0 or the degree of r(x) is smaller than the degree of g(x).

(Remember that we didn't define the degree of the zero polynomial.)

Let us prove that the division rule works. The proof follows the method that
we used in the example: we multiply g(x) by a constant times a power of x so that,
when we subtract it, the degree of the result is smaller than it was. Our proof will
be by induction on the degree of f(x).

So let f(x) and g(x) be polynomials, with g(x) ≠ 0.

Case 1: Either f(x) = 0, or deg(f(x)) < deg(g(x)). In this case we have
nothing to do except to put q(x) = 0 and r(x) = f(x).

Case 2: deg(f(x)) ≥ deg(g(x)). We let deg(f(x)) = n, and assume (as in-
duction hypothesis) that the result is true if f(x) is replaced by a polynomial of
degree less than n. Let

f(x) = a_n x^n + l.d.t.,

g(x) = b_m x^m + l.d.t.,

where we have used the abbreviation l.d.t. for "lower degree terms". We have
a_n ≠ 0, b_m ≠ 0, and (by the case assumption) n ≥ m. Then

(a_n/b_m) x^{n−m} · g(x) = a_n x^n + l.d.t.,

and so the polynomial f*(x) = f(x) − (a_n/b_m) x^{n−m} · g(x) is either zero or satisfies
deg(f*(x)) < deg(f(x)): the subtraction cancels the leading term of f(x). So by
Case 1 or the induction hypothesis, we have

f*(x) = g(x)q*(x) + r*(x),

where r*(x) = 0 or deg(r*(x)) < deg(g(x)). Then

f(x) = g(x)((a_n/b_m) x^{n−m} + q*(x)) + r*(x),

so we can put q(x) = (a_n/b_m) x^{n−m} + q*(x) and r(x) = r*(x) to complete the proof.
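
The proof describes a procedure: repeatedly cancel the leading term of f(x) using a suitable multiple of g(x). Here is a sketch of it in Python (an illustration only). The representation is our own assumption: a polynomial is a list of coefficients with the constant term first, the zero polynomial is the empty list, and exact fractions are used so that a_n/b_m causes no rounding. The names trim and poly_divmod are hypothetical.

```python
from fractions import Fraction

def trim(p):
    """Remove zero coefficients from the top, so that the last entry is the
    leading coefficient (the zero polynomial becomes the empty list)."""
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, g):
    """Return (q, r) with f = g*q + r and r zero or deg(r) < deg(g)."""
    r = trim([Fraction(c) for c in f])
    g = trim([Fraction(c) for c in g])
    if not g:
        raise ValueError("g(x) must be non-zero")
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    # Case 2 of the proof, repeated: while deg(r) >= deg(g), subtract
    # (a_n/b_m) x^(n-m) g(x), which cancels the leading term of r.
    while len(r) >= len(g):
        shift = len(r) - len(g)      # this is n - m
        coeff = r[-1] / g[-1]        # this is a_n / b_m
        q[shift] = coeff
        for i, c in enumerate(g):
            r[shift + i] -= coeff * c
        trim(r)
    return q, r

# The example above: divide x^4 + 4x^3 - x + 5 by x^2 + 2x - 1.
q, r = poly_divmod([5, -1, 0, 4, 1], [-1, 2, 1])
print(q, r)
```

With this representation the quotient x^2 + 2x − 3 appears as the coefficients −3, 2, 1 and the remainder 7x + 2 as 2, 7.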

Having got a division rule for polynomials, we can now copy everything that 
we did for integers. Here is a summary of the definitions and results. 

A non-zero polynomial is called monic if its leading coefficient is 1, that is, if
it has the form

f(x) = x^n + a_{n−1} x^{n−1} + ··· + a_1 x + a_0.






We say that g(x) divides f(x) if f(x) = g(x)q(x) for some polynomial q(x); in
other words, if the remainder in the division rule is zero.

We define the greatest common divisor of two polynomials by the more ad-
vanced definition that we met at the end of the last section. The greatest common
divisor of f(x) and g(x) is a polynomial d(x) with the properties

(a) d(x) divides f(x) and d(x) divides g(x);

(b) if h(x) is any polynomial which divides both f(x) and g(x), then h(x) di-
vides d(x);

(c) d(x) is monic (if it is not the zero polynomial).

The last condition is put in because, for any non-zero real number c, each of
the polynomials f(x) and cf(x) divides the other; without this condition, the gcd
would not be uniquely defined, since any non-zero constant multiple of it would
work just as well.

Theorem 5.5 (a) Any two polynomials f(x) and g(x) have a greatest common
divisor.

(b) The g.c.d. of two polynomials can be found by Euclid's algorithm.

(c) If gcd(f(x), g(x)) = d(x), then there exist polynomials h(x) and k(x) such
that

f(x)h(x) + g(x)k(x) = d(x);

these two polynomials can also be found from the extended version of Eu-
clid's algorithm.

We will not prove this theorem in detail, since the proof is the same as that for 
integers. 

Exercises 

5.1 Find the greatest common divisor of 2047 and 2323, and write it in the form
2047x + 2323y for some integers x and y.

5.2 Find the least common multiple of 2047 and 2323. 

5.3 Find the greatest common divisor oix'^ — I and x^ + 3jc^ + 3, and write it 
in the form [x^ — 1)m(x) + (x^ + 3x^ 3)v(x) for some polynomials u{x) and 
v{x). 

5.4 Prove that, for any two positive integers m and n,

gcd(m, n) · lcm(m, n) = mn.

Does any similar result hold for three positive integers?

Modular arithmetic is an important example of an algebraic system with only a
finite number of elements, unlike most of our examples (the number systems, 
matrices, polynomials, etc.) which have infinitely many elements. 

6.1 Congruence mod m 

Here is a very important example of an equivalence relation. 

Let X = ℤ, the set of integers. We define a relation ≡_m on ℤ, called congruence
mod m, where m is a positive integer, as follows:

x ≡_m y if and only if y − x is a multiple of m.

You will often see this relation written as x ≡ y (mod m). The meaning is
exactly the same.

We check the conditions for an equivalence relation. 

reflexive: x − x = 0 · m, so x ≡_m x.

symmetric: if x ≡_m y, then y − x = cm for some integer c, so x − y = (−c)m, so
y ≡_m x.

transitive: if x ≡_m y and y ≡_m z, then y − x = cm and z − y = dm, so z − x =
(c + d)m, so x ≡_m z.

So ≡_m is an equivalence relation.

This means that the set of integers is partitioned into equivalence classes of the
relation ≡_m. These classes are called congruence classes mod m. We write [x]_m
for the congruence class mod m containing the integer x. (This is what we called
R(x) in the Equivalence Relation Theorem, where R is the name of the relation;
so we should really call it ≡_m(x). But this looks a bit odd, so we say [x]_m instead.)

For example, when m = 4, we have

[0]_4 = {..., −8, −4, 0, 4, 8, 12, ...},
[1]_4 = {..., −7, −3, 1, 5, 9, 13, ...},
[2]_4 = {..., −6, −2, 2, 6, 10, 14, ...},
[3]_4 = {..., −5, −1, 3, 7, 11, 15, ...},

and then the pattern repeats: [4]_4 is the same set as [0]_4 (since 0 ≡_4 4). So there
are just four equivalence classes. More generally:

Proposition 6.1 The equivalence relation ≡_m has exactly m equivalence classes,
namely [0]_m, [1]_m, [2]_m, ..., [m − 1]_m.

Proof Given any integer n, we can divide it by m to get a quotient q and remain-
der r, so that n = mq + r and 0 ≤ r ≤ m − 1. Then n − r = mq, so r ≡_m n, and
n ∈ [r]_m. So every integer lies in one of the classes in the proposition. These
classes are all different, since if i, j both lie in the range 0, ..., m − 1, then j − i
cannot be a multiple of m unless i = j.

We use this in everyday life. Consider time on the 24-hour clock, for example.
What is the time if 298 hours have passed since midnight on 1 January this year?
Since two events occur at the same time of day if their times are congruent mod 24,
we see that the time is $[298]_{24} = [10]_{24}$, that is, 10:00, or 10am.
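The division argument in the proof of Proposition 6.1 is easy to try out on a computer. The following is only a sketch: the function names congruence_class and class_window are my own, and the "window" is just a finite piece of an infinite class.

```python
def congruence_class(n, m):
    """Return the unique r in 0..m-1 with n in [r]_m (m a positive integer)."""
    q, r = divmod(n, m)          # n = m*q + r with 0 <= r <= m-1 when m > 0
    assert n == m * q + r and 0 <= r <= m - 1
    return r


def class_window(r, m, lo=-2, hi=3):
    """A finite window r + k*m (lo <= k <= hi) of the infinite class [r]_m."""
    return [r + k * m for k in range(lo, hi + 1)]


# The four classes mod 4, as in the example above.
for r in range(4):
    print(r, class_window(r, 4))

# The clock example: 298 hours after midnight.
print(congruence_class(298, 24))   # 10
```

Note that Python's divmod already returns a remainder in the range 0, ..., m-1 for positive m, even when n is negative, which is exactly the remainder used in the proof.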

6.2 Operations on congruence classes 

Now we can add and multiply congruence classes as follows:

$$[a]_m + [b]_m = [a + b]_m,$$
$$[a]_m \cdot [b]_m = [ab]_m.$$

Look carefully at these supposed definitions. First, notice that the symbols for
addition and multiplication on the left are the things being defined; on the right
we take the ordinary addition and multiplication of integers.

The second important thing is that we have to do some work to show that we
have defined anything at all. Suppose that $[a]_m = [a']_m$ and $[b]_m = [b']_m$. What
guarantee have we that $[a + b]_m = [a' + b']_m$? If this is not true, then our definition
is worthless; so let's try to prove it. We have

$$a' - a = cm, \quad\text{and}$$
$$b' - b = dm; \quad\text{so}$$
$$(a' + b') - (a + b) = (c + d)m,$$

so indeed $a' + b' \equiv_m a + b$. Similarly, with the same assumption,

$$a'b' - ab = (cm + a)(dm + b) - ab = m(cdm + cb + ad),$$

so $a'b' \equiv_m ab$. So our definition is valid.
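The point of this check is that the answer must not depend on which representatives we pick. Here is a small experiment (not a proof, and the names congruent and operations_well_defined are mine) which replaces a and b by other members a + cm and b + dm of their classes and confirms that the class of the sum and of the product does not change.

```python
from itertools import product


def congruent(x, y, m):
    """True when y - x is a multiple of m, i.e. x and y lie in the same class."""
    return (y - x) % m == 0


def operations_well_defined(m, reps=range(-4, 5), shifts=range(-2, 3)):
    """Test many representatives a, b and shifted representatives a2, b2."""
    for a, b, c, d in product(reps, reps, shifts, shifts):
        a2, b2 = a + c * m, b + d * m      # a2 is in [a]_m, b2 is in [b]_m
        if not congruent(a2 + b2, a + b, m):
            return False
        if not congruent(a2 * b2, a * b, m):
            return False
    return True


print(operations_well_defined(4))
print(operations_well_defined(12))
```

A finite search like this can only fail to find a counterexample; it is the algebra above that shows none exists.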

For example, here are "addition table" and "multiplication table" for the
integers mod 4. I have been lazy and written 0, 1, 2, 3 instead of the correct forms
$[0]_4, [1]_4, [2]_4, [3]_4$.

    + | 0 1 2 3        · | 0 1 2 3
    --+--------        --+--------
    0 | 0 1 2 3        0 | 0 0 0 0
    1 | 1 2 3 0        1 | 0 1 2 3
    2 | 2 3 0 1        2 | 0 2 0 2
    3 | 3 0 1 2        3 | 0 3 2 1

We denote the set of congruence classes mod m, with these operations of
addition and multiplication, by $\mathbb{Z}_m$. Note that $\mathbb{Z}_m$ is a set with m elements. We call
the operations "addition and multiplication mod m".

Theorem 6.2 The set $\mathbb{Z}_m$, with addition and multiplication mod m, satisfies the
commutative, associative, and identity laws for both addition and multiplication,
the inverse law for addition, and the distributive law.

Proof We won't prove the whole thing; here is a proof of the distributive law.
We are trying to prove that

$$[a]_m([b]_m + [c]_m) = [a]_m[b]_m + [a]_m[c]_m.$$

The left-hand side is equal to $[a]_m[b + c]_m$ (by the definition of addition mod m),
which in turn is equal to $[a(b + c)]_m$ (by the definition of multiplication mod m).
Similarly the right-hand side is equal to $[ab]_m + [ac]_m$, which is equal to $[ab + ac]_m$.
Now $a(b + c) = ab + ac$, by the distributive law for integers; so the two sides are
equal.
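Since $\mathbb{Z}_m$ is finite, all the laws in Theorem 6.2 can be checked exhaustively for any particular m. The sketch below does this, writing each class by its representative in 0, ..., m-1; the function name check_laws is my own, and of course such a check covers one value of m at a time, whereas the proof covers them all.

```python
from itertools import product


def check_laws(m):
    """Brute-force check of the laws of Theorem 6.2 in Z_m."""
    elements = range(m)

    def add(a, b):
        return (a + b) % m

    def mul(a, b):
        return (a * b) % m

    for a, b, c in product(elements, repeat=3):
        if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):
            return False                       # commutative laws
        if add(a, add(b, c)) != add(add(a, b), c):
            return False                       # associative law for +
        if mul(a, mul(b, c)) != mul(mul(a, b), c):
            return False                       # associative law for .
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):
            return False                       # distributive law
    for a in elements:
        if add(a, 0) != a or mul(a, 1 % m) != a:
            return False                       # identity laws
        if add(a, (-a) % m) != 0:
            return False                       # additive inverse law
    return True


print(all(check_laws(m) for m in range(1, 13)))
```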

6.3 Inverses 

What about multiplicative inverses? Not every element in $\mathbb{Z}_m$ has an inverse. For
example, $[2]_4$ has no inverse; if you look at row 2 of the multiplication table for
$\mathbb{Z}_4$, you see that it contains only the entries 0 and 2, so there is no element $[b]_4$
such that $[2]_4[b]_4 = [1]_4$. On the other hand, in $\mathbb{Z}_5$, every non-zero element has an
inverse, since

$$[1]_5[1]_5 = [1]_5, \quad [2]_5[3]_5 = [1]_5, \quad [4]_5[4]_5 = [1]_5.$$

Theorem 6.3 The element $[a]_m$ of $\mathbb{Z}_m$ has an inverse if and only if $\gcd(a, m) = 1$.

Proof We have two things to prove: if $\gcd(a, m) = 1$, then $[a]_m$ has an inverse; if
$[a]_m$ has an inverse, then $\gcd(a, m) = 1$.

Suppose that $\gcd(a, m) = 1$. As we saw in the last chapter, Euclid's algorithm
shows that there exist integers x and y such that $xa + ym = 1$. This says that
$xa - 1 = -ym$ is a multiple of m, so that $xa \equiv_m 1$. This means $[x]_m[a]_m = [1]_m$, so
$[x]_m$ is the inverse of $[a]_m$.

Now suppose that $[x]_m$ is the inverse of $[a]_m$, so that $xa \equiv_m 1$. This means that
$xa + ym = 1$ for some integer y. Now let $d = \gcd(a, m)$. Then $d \mid xa$ and $d \mid ym$, so
$d \mid xa + ym = 1$; so we must have $d = 1$, as required.
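The first half of this proof is really an algorithm: run Euclid's algorithm, keep track of the multipliers, and read off the inverse. A minimal sketch follows; the names extended_gcd and inverse_mod are my own choices, and the code assumes m is a positive integer.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with x*a + y*b == g, where g = gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def inverse_mod(a, m):
    """Representative of the inverse of [a]_m, or None if gcd(a, m) != 1."""
    g, x, _ = extended_gcd(a % m, m)
    if g != 1:
        return None
    return x % m          # xa + ym = 1, so [x]_m [a]_m = [1]_m


print(inverse_mod(2, 4))                          # None: [2]_4 has no inverse
print([inverse_mod(a, 5) for a in range(1, 5)])   # [1, 3, 2, 4]
```

The second line of output agrees with the products listed above for $\mathbb{Z}_5$: 1 pairs with itself, 2 with 3, and 4 with itself.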

Corollary 6.4 Suppose that p is a prime number. Then the multiplicative inverse
law holds in $\mathbb{Z}_p$; that is, every non-zero element of $\mathbb{Z}_p$ has an inverse.

Proof If p is prime, then every number a with $1 \le a < p$ satisfies $\gcd(a, p) = 1$.
(For the gcd divides p, so can only be 1 or p; but p clearly doesn't divide a.) Then
the Theorem implies that $[a]_p$ has an inverse in $\mathbb{Z}_p$.

6.4 Fermat's Little Theorem 

We already met Fermat, whose "Last Theorem" gave mathematicians so much 
trouble for so many years. In this section, we will prove a theorem which Fermat 
did succeed in establishing. First, two results about Zp for p prime. 

Lemma 6.5 Let p be a prime number and suppose that $p \mid ab$. Then $p \mid a$ or $p \mid b$.

Proof Suppose that p divides ab but p does not divide a. Since p is prime, we see
that $\gcd(a, p) = 1$. By Euclid's algorithm, there exist integers x and y such that $xa + yp = 1$.
Then $xab + ypb = b$. Now p divides xab (since it divides ab), and clearly p divides
ypb; so p divides b.

Lemma 6.6 Let p be a prime number.

(a) If $[a]_p[b]_p = [0]_p$, then either $[a]_p = [0]_p$ or $[b]_p = [0]_p$.

(b) If $[ab]_p = [ac]_p$ and $[a]_p \ne [0]_p$, then $[b]_p = [c]_p$.

Proof (a) Since $[a]_p[b]_p = [ab]_p$, the assumption $[a]_p[b]_p = [0]_p$ means that $ab \equiv_p 0$,
that is, $p \mid ab$. Then $p \mid a$ or $p \mid b$ by the preceding Lemma; so $[a]_p = [0]_p$ or
$[b]_p = [0]_p$.

(b) We have $[a]_p[b - c]_p = [0]_p$; so, if $[a]_p \ne [0]_p$, then $[b - c]_p = [0]_p$, so that
$[b]_p = [c]_p$.

So we come to Fermat's Little Theorem:

Theorem 6.7 Let p be a prime number. If a is any integer not divisible by p, then
$a^{p-1} \equiv_p 1$.

So, for example, $3^6 \equiv_7 1$, as you can check.

Proof Consider the non-zero elements $[1]_p, [2]_p, \ldots, [p-1]_p$. Multiply them all
by $[a]_p$, to get $[a]_p, [2a]_p, \ldots, [(p-1)a]_p$. Now by the preceding Lemma, none of
these elements is equal to $[0]_p$, and no two of them are equal; so we have the same
list of elements in a different order. So their product is the same:

$$[a]_p[2a]_p \cdots [(p-1)a]_p = [1]_p[2]_p \cdots [p-1]_p,$$

from which we see that

$$[a^{p-1}]_p[1]_p[2]_p \cdots [p-1]_p = [1]_p[2]_p \cdots [p-1]_p.$$

Since $[1]_p[2]_p \cdots [p-1]_p \ne [0]_p$, we conclude from the lemma that

$$[a^{p-1}]_p = [1]_p,$$

in other words, $a^{p-1} \equiv_p 1$, as required.

For example, if p = 7 and a = 3, then the multiples of 3 mod 7 are

$$[3]_7, \quad [6]_7, \quad [9]_7 = [2]_7, \quad [12]_7 = [5]_7, \quad [15]_7 = [1]_7, \quad [18]_7 = [4]_7,$$

so we do obtain all the non-zero congruence classes in a different order.
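Both steps of the proof can be watched in action: the multiples of a are a rearrangement of the non-zero classes, and the power $a^{p-1}$ comes out as 1. The helper names below are mine; the three-argument form of Python's built-in pow computes a power mod p directly.

```python
def multiples_mod_p(a, p):
    """Representatives of [a]_p, [2a]_p, ..., [(p-1)a]_p in 0..p-1."""
    return [(k * a) % p for k in range(1, p)]


def fermat_holds(a, p):
    """Check a^(p-1) = 1 in Z_p using modular exponentiation."""
    return pow(a, p - 1, p) == 1


print(multiples_mod_p(3, 7))          # [3, 6, 2, 5, 1, 4]
print(sorted(multiples_mod_p(3, 7)))  # [1, 2, 3, 4, 5, 6]
print(3 ** 6, fermat_holds(3, 7))     # 729 True  (729 = 7 * 104 + 1)
```

The hypothesis that p is prime matters: for instance, with the composite modulus 4 and a = 2 we get $2^3 = 8 \equiv_4 0$, not 1.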

Exercises 

6.1 Find the units in Z30 and their inverses. 

2 3 

6.2 Calculate ^ + 4 ^29- 

6.3 Solve the quadratic equation $x^2 + 2x + 2 = 0$

(a) in $\mathbb{Z}_{17}$,

(b) in $\mathbb{Z}_{19}$.

6.4 Prove that $(p - 1)! \equiv_p -1$ if and only if p is prime. (This is Wilson's
Theorem.)

Chapter 7

Polynomials revisited 



We will look at three further aspects of polynomials. First, we have only consid- 
ered real polynomials so far, but this can be generalised: as long as we can add 
and multiply the coefficients, we can do the same with polynomials. Second, we 
look at factorisation and show that, under the right conditions, the division rule, 
Euclid's algorithm, and the Remainder and Factor Theorems hold. Finally, the 
construction of the integers mod m by means of congruence classes can be ex- 
tended to polynomials. This gives us a less ad hoc construction of the complex 
numbers, as well as other finite systems having addition and multiplication. 

7.1 Polynomials over other systems 

Let R be a set on which two operations (called addition and multiplication) are
defined. Suppose that R satisfies the following laws. (We call this collection of
laws CRI.)

• the commutative, associative, identity and inverse laws for addition (the 
identity for addition is called 0, and the inverse of a is —a); 

• the commutative, associative, and identity laws for multiplication (the
identity for multiplication is called 1);

• the distributive law. 

Later, we will study such systems formally under the name "commutative rings 
with identity". In this section, we will put them to use. 
The examples we have met already include: 

• $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$;

• $\mathbb{R}[x]$, the polynomials with real coefficients;

• $\mathbb{Z}_m$, the integers mod m.

We can define a polynomial over R to be an expression of the form

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,$$

where n is a non-negative integer and $a_n, a_{n-1}, \ldots, a_1, a_0 \in R$. We adopt the same
rules as we discussed earlier for when two expressions represent the same polynomial
(we can insert or remove terms with coefficient zero, and we can replace $1x^n$
by $x^n$). Now we can add and multiply polynomials by the same rules as before.
Let $R[x]$ be the set of all polynomials with coefficients in R.

Proposition 7.1 If R satisfies the laws (CRI) above, then so does $R[x]$.

Proof As usual, we don't give a detailed verification of all the laws. The ap- 
pendix to this chapter gives some of the details. 

We end this section with a warning. We informally defined a real polynomial 
to be a function on the real numbers given by an expression of the right form. This 
no longer works for more general polynomials. 

Example Let $R = \mathbb{Z}_2$, the integers mod 2. The set R contains two elements,
$[0]_2$ and $[1]_2$, which we will write more briefly as 0 and 1. The laws (CRI) are
satisfied.

Now consider the two polynomials $x$ and $x^2$. Since $0^2 = 0$ and $1^2 = 1$, these
two polynomials give rise to the same function on $\mathbb{Z}_2$. However, we really do want
to regard them as different polynomials! Hence we regard polynomials as being
formal expressions, not the functions they define.
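One way to make "formal expression" concrete is to store a polynomial as its list of coefficients. This is just one possible representation, chosen here for illustration: the list [a0, a1, ..., an] stands for $a_0 + a_1 x + \cdots + a_n x^n$, and the function name evaluate is my own.

```python
def evaluate(coeffs, c, m):
    """Value at c, in Z_m, of the polynomial with coefficients [a0, a1, ...]."""
    result = 0
    for a in reversed(coeffs):        # Horner's rule
        result = (result * c + a) % m
    return result


x_poly = [0, 1]         # the polynomial x
x_squared = [0, 0, 1]   # the polynomial x^2

# Same function on Z_2 ...
print([evaluate(x_poly, c, 2) for c in (0, 1)])      # [0, 1]
print([evaluate(x_squared, c, 2) for c in (0, 1)])   # [0, 1]

# ... but different polynomials (different coefficient lists).
print(x_poly == x_squared)                           # False
```

Over $\mathbb{Z}_3$ the two polynomials are already different as functions, since $2^2 = 4 \equiv_3 1$ but $2 \not\equiv_3 1$.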

7.2 Division and factorisation 

The division rule and Euclid's algorithm work in almost the same way for poly- 
nomials as for integers. So we can mimic the definition of the integers mod m. 

We need one more property for the coefficients, beyond the laws (CRI) we
assumed before. The extra law is the inverse law for multiplication, which states
that every element a of R except 0 has a multiplicative inverse $a^{-1}$. A system
satisfying (CRI) and the inverse law for multiplication is called a field. The examples
we know so far are $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, and $\mathbb{Z}_p$ for prime numbers p.

[Why do we need this extra law? Look back at the proof of the division rule
for polynomials. To divide $f(x) = a_n x^n + \cdots$ by $g(x) = b_m x^m + \cdots$, where $b_m \ne 0$
and $n \ge m$, we first subtract $(a_n/b_m)x^{n-m}g(x)$ from $f(x)$ to obtain a polynomial
of smaller degree. So we need to be able to divide $a_n$ by $b_m$, that is, we need a
multiplicative inverse for $b_m$.]

Theorem 7.2 If R is a field, then the set $R[x]$ of polynomials with coefficients in R
satisfies the division rule, and Euclid's algorithm works in $R[x]$.
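The subtraction step described in the bracketed remark can be turned into a short program. What follows is only a sketch for the field $\mathbb{Z}_p$ with p prime: polynomials are coefficient lists with the constant term first (my own convention), the names trim and poly_divmod are mine, and the inverse of the leading coefficient is obtained from pow(b, -1, p), which needs Python 3.8 or later.

```python
def trim(f):
    """Remove zero coefficients from the high-degree end of the list."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_divmod(f, g, p):
    """Divide f by g over Z_p (p prime). Returns (q, r) with f = g*q + r
    and r = [] (the zero polynomial) or deg r < deg g."""
    f = trim([c % p for c in f])
    g = trim([c % p for c in g])
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    lead_inv = pow(g[-1], -1, p)              # inverse of b_m in Z_p
    q = [0] * max(len(f) - len(g) + 1, 0)
    r = f
    while len(r) >= len(g):
        shift = len(r) - len(g)               # this is n - m
        factor = (r[-1] * lead_inv) % p       # this is a_n / b_m
        q[shift] = factor
        for i, c in enumerate(g):             # subtract factor * x^shift * g
            r[shift + i] = (r[shift + i] - factor * c) % p
        r = trim(r)                           # the degree has dropped
    return trim(q), r


# (x^2 + x + 1) divided by (x + 1) over Z_2: quotient x, remainder 1.
print(poly_divmod([1, 1, 1], [1, 1], 2))      # ([0, 1], [1])
```

Over a ring that is not a field, such as $\mathbb{Z}_4$, the call to pow may raise an error because the leading coefficient need not have an inverse; that is exactly the obstruction the remark points out.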

A polynomial $g(x)$ of degree greater than zero is called irreducible if it cannot
be written in the form $g(x) = h(x)k(x)$, where $h(x)$ and $k(x)$ are polynomials with
degrees smaller than the degree of $g(x)$.

We will not treat irreducible polynomials in detail, but simply look at one
technique for recognising them. Let $f(x)$ be a polynomial over the field R. For
any $c \in R$, we let $f(c)$ denote the result of "substituting c into $f(x)$"; that is,

if $f(x) = a_n x^n + \cdots + a_0$, then $f(c) = a_n c^n + \cdots + a_0$.

The next theorem combines two familiar results about polynomials, the Remainder
Theorem and the Factor Theorem. Notice that, if we divide $f(x)$ by a
polynomial of degree 1, the remainder has degree zero, that is, it is a constant
polynomial (which we regard as an element of R).

Theorem 7.3 Let $f(x)$ be a polynomial over a field R, and $c \in R$.

(a) The remainder when $f(x)$ is divided by $x - c$ is $f(c)$.

(b) $f(x)$ is divisible by $x - c$ if and only if $f(c) = 0$.

Proof (a) Write $f(x) = (x - c)q(x) + r$, where r is a constant polynomial.
Substituting c into this equation we find $f(c) = r$.

(b) If $f(c) = 0$ then $f(x) = (x - c)q(x)$, so $x - c$ divides $f(x)$. The converse is
clear from the uniqueness of the remainder in the division rule.

Example The polynomial $f(x) = x^3 - 2$ is irreducible in $\mathbb{Q}[x]$. For if it
factorises, it must be a product of polynomials of degrees 1 and 2. The polynomial of
degree 1 has the form $x - c$, where c is a rational number; by the Factor Theorem,
$f(c) = 0$, that is, $c^3 = 2$. But, following Euclid's proof, it can be shown that $\sqrt[3]{2}$
is irrational (this is an exercise for you); so this is impossible.

7.3 "Modular arithmetic" for polynomials 

Now let R be a field, and let $g(x)$ be a fixed non-zero polynomial in $R[x]$. To
make things easier, we assume that $g(x)$ is monic. We say that two polynomials
$f_1(x)$ and $f_2(x)$ are congruent mod $g(x)$ if $g(x)$ divides $f_1(x) - f_2(x)$, that is, if
$f_1(x) = g(x)h(x) + f_2(x)$ for some polynomial $h(x)$.

Proposition 7.4 Congruence mod $g(x)$ is an equivalence relation, and each
equivalence class contains a unique polynomial $r(x)$ such that $r(x) = 0$ or
$\deg(r(x)) < \deg(g(x))$.

This is just a re-statement of the division rule. We will denote the equivalence
class of $f(x)$ by $[f(x)]$, and call it a congruence class mod $g(x)$.

Let E be the set of congruence classes mod g{x). Just as we did for congru- 
ence classes mod m in the integers, we are going to give rules for adding and 
multiplying elements of E. The rules are the obvious ones: 

• $[f_1(x)] + [f_2(x)] = [f_1(x) + f_2(x)]$,

• $[f_1(x)] \cdot [f_2(x)] = [f_1(x)f_2(x)]$.

Just as before, we have first to do some work to show that our definition is
a good one. That is, if $f_1(x) \equiv f_1'(x)$ and $f_2(x) \equiv f_2'(x)$, then
$f_1(x) + f_2(x) \equiv f_1'(x) + f_2'(x)$ and $f_1(x)f_2(x) \equiv f_1'(x)f_2'(x)$. (All congruences are
modulo $g(x)$.) Here is the proof of the first statement; try the second for yourself.
We are given that $f_1(x) - f_1'(x) = g(x)h_1(x)$ and $f_2(x) - f_2'(x) = g(x)h_2(x)$. Then we find

$$(f_1(x) + f_2(x)) - (f_1'(x) + f_2'(x)) = g(x)(h_1(x) + h_2(x)),$$

which shows the required congruence.

Proposition 7.5 If R is a field, then the set E of congruence classes mod $g(x)$ also
satisfies (CRI).

The proof of this simply involves routine checking of laws.
For integers, we saw that $\mathbb{Z}_p$ is a field if p is prime. Something very similar
happens here; in place of primes, we use irreducible polynomials.

Theorem 7.6 Suppose that R is a field and $g(x)$ an irreducible polynomial in $R[x]$.
Then the set E of equivalence classes mod $g(x)$ is also a field, and contains the
field R.

Proof We have to show that a non-zero congruence class has a multiplicative
inverse.

Suppose that the equivalence class $[f(x)]$ is not zero. This means that $g(x)$
doesn't divide $f(x)$. So $\gcd(f(x), g(x)) = 1$. (For the gcd is a monic polynomial
dividing $g(x)$; since $g(x)$ is irreducible, the gcd is either 1 or $g(x)$, and it cannot
be $g(x)$ because $g(x)$ doesn't divide $f(x)$.)

By Euclid's algorithm, there are polynomials $h(x)$ and $k(x)$ such that

$$f(x)h(x) + g(x)k(x) = 1.$$

This equation says that $f(x)h(x) \equiv 1$ mod $g(x)$, so $[f(x)] \cdot [h(x)] = [1]$. Thus we
have found an inverse for $[f(x)]$.

To find a copy of the field R inside E, we just take the equivalence classes of
the constant polynomials; they add and multiply just like elements of R:

$$[c] + [d] = [c + d], \quad [c] \cdot [d] = [cd].$$

This is all very good, but a bit too abstract for practical use. Here is a descrip- 
tion which is easier to calculate with. 

Proposition 7.7 Suppose that the hypotheses of the previous theorem are satisfied,
and let m be the degree of $g(x)$. Then E is the set

$$\{c_{m-1}\alpha^{m-1} + \cdots + c_0 : c_0, \ldots, c_{m-1} \in R\},$$

where $\alpha$ is a new symbol satisfying $g(\alpha) = 0$.

Proof We saw that the constant polynomials $[c]$ are just like elements of R, so
we ignore the difference and identify them with elements of R. Let $\alpha = [x]$, the
congruence class containing the polynomial x. Now we saw that each equivalence
class contains a unique polynomial $r(x)$ of degree less than m (or zero). If
$r(x) = c_{m-1}x^{m-1} + \cdots + c_0$, then

$$[r(x)] = [c_{m-1}x^{m-1} + \cdots + c_0]$$
$$= [c_{m-1}][x]^{m-1} + \cdots + [c_0]$$
$$= c_{m-1}\alpha^{m-1} + \cdots + c_0.$$

(In the second line we used the rules for adding and multiplying equivalence
classes; in the third, we put $[c] = c$ and $[x] = \alpha$.)

Finally, $g(x) \equiv 0$ mod $g(x)$, so $[g(x)] = [0]$. By the same argument, this gives
$g(\alpha) = 0$.

Time for a (very important) example. Let R be the field $\mathbb{R}$ of real numbers. We
take $g(x)$ to be the polynomial $x^2 + 1$. (This is irreducible; for its factors, if any,
would have degree 1, but if, say,

$$x^2 + 1 = (x - a)(x - b),$$

then $a^2 = -1$, which is impossible since the square of any real number is non-negative.)

Now the field E of our construction consists of all expressions of the form
$c + d\alpha$, where c and d are real numbers and $\alpha$ is a new symbol satisfying $\alpha^2 + 1 = 0$.
Thus $\alpha$ is the symbol usually called i. So the complex numbers are not just a
fluke; they are a special case of a very general construction!
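To see the construction at work, we can multiply two expressions $c + d\alpha$ as polynomials and then reduce using $\alpha^2 = -1$, and compare with Python's built-in complex numbers. In this sketch the pair (c, d) stands for $c + d\alpha$, and the function name multiply_mod_x2_plus_1 is my own.

```python
def multiply_mod_x2_plus_1(u, v):
    """Product of c1 + d1*alpha and c2 + d2*alpha, reduced with alpha^2 = -1.

    (c1 + d1 a)(c2 + d2 a) = c1 c2 + (c1 d2 + d1 c2) a + d1 d2 a^2
                           = (c1 c2 - d1 d2) + (c1 d2 + d1 c2) a.
    """
    c1, d1 = u
    c2, d2 = v
    return (c1 * c2 - d1 * d2, c1 * d2 + d1 * c2)


alpha = (0, 1)
print(multiply_mod_x2_plus_1(alpha, alpha))      # (-1, 0), i.e. alpha^2 = -1

u, v = (1, 2), (3, -1)
print(multiply_mod_x2_plus_1(u, v))              # (5, 5)
print(complex(*u) * complex(*v))                 # (5+5j)
```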
7.4 Finite fields

We saw that $\mathbb{Z}_m$ is a field with m elements if m is prime, but is not a field if m is
composite. Is there a finite field with four elements? We cannot use $\mathbb{Z}_4$, since $[2]_4$
has no multiplicative inverse.

We apply the construction of the preceding section. First we find an irreducible 
polynomial. 

Lemma 7.8 The polynomial $x^2 + x + 1$ is irreducible over $\mathbb{Z}_2$.

Proof If not, it has a factor of the form $x - c$ for some $c \in \mathbb{Z}_2$. By the Factor
Theorem, this would mean that $f(c) = 0$. But, writing 0 and 1 instead of the more
cumbersome $[0]_2$ and $[1]_2$,

$$0^2 + 0 + 1 = 1,$$
$$1^2 + 1 + 1 = 1,$$

so no c satisfying $c^2 + c + 1 = 0$ exists, and there is no factor $(x - c)$.

So there is a field E consisting of the elements $c\alpha + d$, with $c, d \in \mathbb{Z}_2$ and
$\alpha^2 + \alpha + 1 = 0$. The elements of E are $0, 1, \alpha, \alpha + 1$. The elements 0 and 1
comprise the field $\mathbb{Z}_2$, so that $1 + 1 = 0$. Then $x + x = (1 + 1)x = 0$ for any x, and
so $\alpha^2 = -\alpha - 1 = \alpha + 1$. Then we can do calculations like

$$(\alpha + 1)^2 = \alpha^2 + \alpha + \alpha + 1 = \alpha + 1 + 1 = \alpha.$$



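In general, any expression involving $\alpha$ can be calculated to be one of the four
elements $0, 1, \alpha, \alpha + 1$.

The addition and multiplication tables for the field E can now be worked out:

     +  |  0    1    α    α+1         ·  |  0    1    α    α+1
    ----+--------------------        ----+--------------------
     0  |  0    1    α    α+1         0  |  0    0    0    0
     1  |  1    0    α+1  α           1  |  0    1    α    α+1
     α  |  α    α+1  0    1           α  |  0    α    α+1  1
    α+1 |  α+1  α    1    0          α+1 |  0    α+1  1    α

Now the multiplicative inverses of $1$, $\alpha$ and $\alpha + 1$ are, respectively, $1$, $\alpha + 1$,
and $\alpha$.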
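The rules $1 + 1 = 0$ and $\alpha^2 = \alpha + 1$ are all that is needed to compute in this field, so the tables can be generated mechanically. In the sketch below the pair (c, d) stands for $c\alpha + d$; the names ELEMENTS, NAMES, add and mul, and the letter "a" used in printing for $\alpha$, are my own choices.

```python
ELEMENTS = [(0, 0), (0, 1), (1, 0), (1, 1)]      # (c, d) means c*alpha + d
NAMES = {(0, 0): "0", (0, 1): "1", (1, 0): "a", (1, 1): "a+1"}


def add(u, v):
    """Add coefficientwise in Z_2."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)


def mul(u, v):
    """Multiply c1*a + d1 by c2*a + d2 and reduce with a^2 = a + 1:
    c1c2 a^2 + (c1d2 + d1c2) a + d1d2 = (c1c2 + c1d2 + d1c2) a + (c1c2 + d1d2)."""
    c1, d1 = u
    c2, d2 = v
    return ((c1 * c2 + c1 * d2 + d1 * c2) % 2, (c1 * c2 + d1 * d2) % 2)


def print_table(op, symbol):
    print(symbol.ljust(4), *(NAMES[v].ljust(4) for v in ELEMENTS))
    for u in ELEMENTS:
        print(NAMES[u].ljust(4), *(NAMES[op(u, v)].ljust(4) for v in ELEMENTS))


print_table(add, "+")
print_table(mul, ".")
```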

Évariste Galois is one of the founders of modern algebra. He was killed in a
duel at the age of 20; already he had worked out, and published, the construction
of finite fields (he did much more than we have seen, showing that the number
of elements in a finite field must be a power of a prime, and that for each prime
power there is a unique finite field of that order). Finite fields are called Galois
fields in his honour; the field with q elements is denoted by GF(q). So the field
we constructed above is GF(4).

But his major work, in which he showed how group theory could be used 
to decide when a "solution by radicals" for a general polynomial equation could 
be found, had been lost by referees at the French Academy (who were probably 
unable to understand it). Its main impact came fifteen years later when it was 
rediscovered and published. 

As well as algebra, Galois was deeply involved in the revolutionary politics of
his time. The duel in which he was shot and killed was apparently over a woman;
but historians have uncovered evidence that it had been set up, either by the
Royalist police, or by the revolutionaries to whom he had offered himself as a
sacrifice to spark a general uprising. If the second explanation is true, then he
died in vain, as there was no uprising.

7.5 Appendix: Laws for polynomials 

In Proposition 7.1, we asserted that, if R satisfies the system (CRI) of laws, then 
so does R[x] , the set of polynomials over R. In this appendix I will say a few words 
about the proof. First, let us be clear about the definitions. 
A polynomial over R is an expression

$$f(x) = a_n x^n + \cdots + a_1 x + a_0 = \sum_{i=0}^{n} a_i x^i.$$

Suppose that $g(x)$ is another polynomial:

$$g(x) = b_m x^m + \cdots + b_0 = \sum_{i=0}^{m} b_i x^i.$$

To add $f(x)$ and $g(x)$, we first assume that $m = n$. (If $m < n$, we add extra terms
$0x^i$ for $i = m + 1, \ldots, n$ to the polynomial $g(x)$, and similarly if $n < m$ we add zero
terms to $f(x)$.) Then

$$(f + g)(x) = \sum_{i=0}^{n} (a_i + b_i)x^i.$$

The rule for multiplication is a bit more complicated:

$$(fg)(x) = \sum_{i=0}^{m+n} c_i x^i,$$

where

$$c_i = \sum_j a_j b_{i-j},$$

the sum being over all values of j for which both $a_j$ and $b_{i-j}$ are defined; that is,
we have $0 \le j \le n$ and $0 \le i - j \le m$, so that $i - m \le j \le i$. We can summarise
these two sets of conditions by saying

$$\max(0, i - m) \le j \le \min(i, n).$$
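The two rules translate directly into code. In the sketch below a polynomial is stored as the list [a0, a1, ..., an] of its coefficients (integer coefficients are used for the demonstration, but only + and · on the coefficients are needed); the function names poly_add and poly_mul are my own.

```python
def poly_add(a, b):
    """Sum of two polynomials, padding the shorter one with zero terms."""
    size = max(len(a), len(b))
    a = list(a) + [0] * (size - len(a))
    b = list(b) + [0] * (size - len(b))
    return [a[i] + b[i] for i in range(size)]


def poly_mul(a, b):
    """Product of f (coefficients a, degree n) and g (coefficients b, degree m):
    c_i is the sum of a_j * b_(i-j) over max(0, i-m) <= j <= min(i, n)."""
    n, m = len(a) - 1, len(b) - 1
    c = []
    for i in range(m + n + 1):
        lo, hi = max(0, i - m), min(i, n)
        c.append(sum(a[j] * b[i - j] for j in range(lo, hi + 1)))
    return c


f, g, h = [1, 2, 3], [4, 5, 0], [7, 0, 1, 2]
print(poly_mul([1, 2, 3], [4, 5]))       # [4, 13, 22, 15]

# One instance of the distributive law (f + g)h = fh + gh.
print(poly_mul(poly_add(f, g), h) == poly_add(poly_mul(f, h), poly_mul(g, h)))
```

Here $(1 + 2x + 3x^2)(4 + 5x) = 4 + 13x + 22x^2 + 15x^3$, which matches the first line of output.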

Consider, for example, the distributive law

$$(f + g)h = fh + gh.$$

We assume that f and g are as above (with $m = n$) and that

$$h(x) = \sum_{i=0}^{p} d_i x^i.$$

Then the coefficient of $x^i$ in $(f + g)h$ is

$$\sum_j (a_j + b_j)d_{i-j} = \sum_j (a_j d_{i-j} + b_j d_{i-j}),$$

using the distributive law for R; and the coefficient in $fh + gh$ is

$$\sum_j a_j d_{i-j} + \sum_j b_j d_{i-j}.$$

Now all the sums are over the same range $\max(0, i - p) \le j \le \min(i, n)$, and
rearranging the terms shows that the two expressions are equal.

Exercises 

7.1 Suppose that $ax + b$ divides $c_n x^n + \cdots + c_0$ in $\mathbb{Z}[x]$, where $a, b, c_0, \ldots, c_n$ are
integers. Show that a divides $c_n$, and b divides $c_0$. Hence show that $x^n - 2$ is
irreducible over $\mathbb{Z}$ for any positive integer n.

7.2 (a) Show that the polynomial $x^2 + 1$ is irreducible over $\mathbb{Z}_3$.
(b) Construct a field with nine elements.

7.3 Verify the associative law for multiplication of polynomials. 



Chapter 8 
Rings 



We have seen many different types of structure (numbers, matrices, polynomials, 
sets, modular arithmetic) which satisfy very similar laws. Now we take the ob- 
vious next step: we consider systems satisfying these laws abstractly, and prove 
things about them directly from the laws they satisfy. The results will then be true 
in our systems no matter what they are made up of. This is called the axiomatic 
method. 

8.1 Rings 

A ring is a set R of elements with two operations, addition (written +) and multi- 
plication (written ■ or just by juxtaposing the factors) which satisfies the following 
laws. (Most of these we have seen before, but we state them all formally here.) 
Additive laws: 

(A0) Closure law: For all $a, b \in R$, we have $a + b \in R$.

(A1) Associative law: For all $a, b, c \in R$, we have $a + (b + c) = (a + b) + c$.

(A2) Zero law: There is an element $0 \in R$ with the property that $a + 0 = 0 + a = a$
for all $a \in R$.

(A3) Additive inverse law: For all $a \in R$, there exists an element $b \in R$ such that
$a + b = b + a = 0$. We write b as $-a$.

(A4) Commutative law: For all $a, b \in R$, we have $a + b = b + a$.

Multiplicative laws:

(M0) Closure law: For all $a, b \in R$, we have $ab \in R$.

(M1) Associative law: For all $a, b, c \in R$, we have $a(bc) = (ab)c$.

Mixed laws:

(D) Distributive laws: For all $a, b, c \in R$, we have $a(b + c) = ab + ac$ and
$(b + c)a = ba + ca$.

Before we go further, a couple of comments: 

• The closure laws are new. Strictly speaking, they are not necessary, since to 
say that + is an operation on R means that the output a + b when we input 

a and b to the black box belongs to R. We put them in as a reminder that, 
when we are checking that something is a ring, we have to be sure that this 
holds. 

• We have stated the identity and inverse laws for addition in a more complicated
way than necessary. Since we are going on to state the commutative
law for addition, we could simply have said that $a + 0 = a$ and $a + (-a) = 0$.
We'll see the reason soon.

We have already seen that sometimes the multiplication satisfies further laws, 
which resemble the laws for addition. This won't always be the case, so we give 
special names to rings in which these laws hold. 

Let R be a ring. We say that R is a ring with identity if

(M2) Identity law: There is an element $1 \in R$ (with $1 \ne 0$) such that $1a = a1 = a$
for all $a \in R$.

We say that R is a division ring if it satisfies (M2) and also

(M3) Multiplicative inverse law: for all $a \in R$, if $a \ne 0$, then there exists $b \in R$
such that $ab = ba = 1$. We write b as $a^{-1}$.

We say that R is a commutative ring if

(M4) Commutative law: for all $a, b \in R$, we have $ab = ba$.

(Note that the word "commutative" here refers to the multiplication; the addition 
in a ring is always commutative.) Finally, we say that R is a field if it satisfies 
(M2), (M3) and (M4). 

The condition (CRI) which we introduced in the last chapter thus stands for
"commutative ring with identity".

In a non-commutative ring, we need to assume both parts of the identity and 
multiplicative inverse laws, since one does not follow from the other. Similarly, 
we do need both parts of the distributive law. 
8.2 Examples of rings

We have a ready-made stock of examples:

• $\mathbb{Z}$ is a commutative ring with identity.

• $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$ are fields.

• If R is a commutative ring with identity, then so is the polynomial ring $R[x]$.

• If R is a ring, then so is the set $M_{n \times n}(R)$ of all $n \times n$ matrices over R (with the
usual definitions of matrix addition and multiplication). If R has an identity,
then so does $M_{n \times n}(R)$. But this ring is usually not commutative.

• For any m, the set $\mathbb{Z}_m$ of integers mod m is a commutative ring with identity.
It is a field if and only if m is prime.

• If R is a field and $g(x)$ a monic polynomial of degree at least 1 over R, then
the set of congruence classes of polynomials mod $g(x)$ is a commutative ring
with identity. It is a field if (and only if) $g(x)$ is an irreducible polynomial.

Note that the third and fourth of these constructions (polynomials and matri- 
ces) are methods of building new rings from old ones. You may guess that the 
fifth and sixth can also be made into constructions of new rings from old. This is 
correct, but the construction is beyond the scope of this course. You will meet it 
next year in Algebraic Structures I. 

Some other familiar structures do not form rings. For example, the set of 
natural numbers N is not a ring, since the additive inverse law does not hold. 

At the end of the last chapter, we constructed a field with four elements. 

8.3 Properties of rings 

We now give a few properties of rings. Since we only use the ring axioms in the 
proofs, and not any special properties of the elements, these are valid for all rings. 
This is the advantage of the axiomatic method. 

Proposition 8.1 In a ring R,

(a) there is a unique zero element;

(b) any element has a unique additive inverse.

Proof (a) Suppose that z and z' are two zero elements. This means that, for any
$a \in R$,

$$a + z = z + a = a,$$
$$a + z' = z' + a = a.$$

Now we have $z + z' = z'$ (putting $a = z'$ in the first equation) and $z + z' = z$ (putting
$a = z$ in the second). So $z = z'$.

This justifies us in calling the unique zero element 0.

(b) Suppose that b and b' are both additive inverses of a. This means that

$$a + b = b + a = 0,$$
$$a + b' = b' + a = 0.$$

Hence

$$b = b + 0 = b + (a + b') = (b + a) + b' = 0 + b' = b'.$$

(Here the first and last equalities hold because 0 is the zero element; the second
and second last are our assumptions about b and b'; and the middle equality is the
associative law.)

This justifies our use of $-a$ for the unique inverse of a.

Proposition 8.2 Let R be a ring.

(a) If R has an identity, then this identity is unique.

(b) If $a \in R$ has a multiplicative inverse, then this inverse is unique.

The proof is almost identical to that of the previous proposition, and is left as
an exercise.

The next result is called the cancellation law. 

Proposition 8.3 Let R be a ring. If $a + b = a + c$, then $b = c$.

Proof

$$b = 0 + b = (-a + a) + b = -a + (a + b) = -a + (a + c) = (-a + a) + c = 0 + c = c.$$

Here the third and fifth equalities use the associative law, and the fourth is what
we are given. To see where this proof comes from, start with $a + b = a + c$, then
add $-a$ to each side and work each expression down using the associative, inverse
and zero laws.

Remark Try to prove that, if R is a field and $a \ne 0$, then $ab = ac$ implies $b = c$.

The next result is something you might have expected to find amongst our 
basic laws. But it is not needed there, since we can prove it! 

Proposition 8.4 Let R be a ring. For any element $a \in R$, we have $0a = a0 = 0$.

Proof We have $0 + 0 = 0$, since 0 is the zero element. Multiply both sides by a:

$$a0 + a0 = a(0 + 0) = a0 = a0 + 0,$$

where the last equality uses the zero law again. Now from $a0 + a0 = a0 + 0$, we
get $a0 = 0$ by the cancellation law. The other part $0a = 0$ is proved similarly; try
it yourself.

There is one more fact we need. This fact uses only the associative law in its
proof, so it holds for both addition and multiplication. To state it, we take $\circ$ to be
a binary operation on a set X, which satisfies the associative law. That is,

$$a \circ (b \circ c) = (a \circ b) \circ c$$

for all $a, b, c \in X$. This means that we can write $a \circ b \circ c$ without ambiguity.

What about applying the operation to four elements? We have to put in brackets
to specify the order in which the operation is applied. There are five possibilities:

$$a \circ (b \circ (c \circ d))$$
$$a \circ ((b \circ c) \circ d)$$
$$(a \circ b) \circ (c \circ d)$$
$$(a \circ (b \circ c)) \circ d$$
$$((a \circ b) \circ c) \circ d$$

Now the first and second are equal, since $b \circ (c \circ d) = (b \circ c) \circ d$. Similarly the
fourth and fifth are equal. Consider the third expression. If we put $x = a \circ b$,
then this expression is $x \circ (c \circ d)$, which is equal to $(x \circ c) \circ d$, which is the last
expression. Similarly, putting $y = c \circ d$, we find it is equal to the first. So all five
are equal.

This result generalises: 

Proposition 8.5 Let $\circ$ be an operation on a set X which satisfies the associative
law. Then the value of the expression

$$a_1 \circ a_2 \circ \cdots \circ a_n$$

is the same, whatever (legal) way $n - 2$ pairs of brackets are inserted.

I won't give the inductive proof here; you are encouraged to try it yourself!
You will find the proof in an appendix to the notes.
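It is instructive to list all the bracketings by machine. The sketch below (the function name bracketings is my own) splits the expression at every possible position for the final operation, evaluates both halves recursively, and collects the results. Joining strings is associative but not commutative, so it makes a good test; subtraction is not associative, and the bracketings really do give different answers.

```python
def bracketings(items, op):
    """Values of items[0] o items[1] o ... under every legal bracketing."""
    if len(items) == 1:
        return [items[0]]
    results = []
    for k in range(1, len(items)):            # the last operation joins
        for left in bracketings(items[:k], op):     # the first k terms
            for right in bracketings(items[k:], op):  # with the rest
                results.append(op(left, right))
    return results


def concat(s, t):
    return s + t


def subtract(x, y):
    return x - y


print(len(bracketings(list("abcd"), concat)))   # 5 bracketings of four terms
print(set(bracketings(list("abcd"), concat)))   # {'abcd'}: all equal
print(sorted(set(bracketings([1, 2, 3, 4], subtract))))   # several values
```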
8.4 Units

Let R be a ring with identity element 1. An element $u \in R$ is called a unit if there
is an element $v \in R$ such that $uv = vu = 1$. The element v is called the inverse of
u, written $u^{-1}$. By Proposition 8.2, a unit has a unique inverse.
Here are some properties of units.

Proposition 8.6 Let R be a ring with identity.

(a) 0 is not a unit.

(b) 1 is a unit; its inverse is 1.

(c) If u is a unit, then so is $u^{-1}$; its inverse is u.

(d) If u and v are units, then so is uv; its inverse is $v^{-1}u^{-1}$.



Proof (a) Since $0v = 0$ for all $v \in R$ and $0 \ne 1$, there is no element v such that
$0v = 1$.

(b) The equation $1 \cdot 1 = 1$ shows that 1 is the inverse of 1.

(c) The equation $u^{-1}u = uu^{-1} = 1$, which holds because $u^{-1}$ is the inverse of
u, also shows that u is the inverse of $u^{-1}$.

(d) Suppose that $u^{-1}$ and $v^{-1}$ are the inverses of u and v. Then

$$(uv)(v^{-1}u^{-1}) = u(vv^{-1})u^{-1} = u1u^{-1} = uu^{-1} = 1,$$
$$(v^{-1}u^{-1})(uv) = v^{-1}(u^{-1}u)v = v^{-1}1v = v^{-1}v = 1,$$

so $v^{-1}u^{-1}$ is the inverse of uv.

Here is how Hermann Weyl explains Proposition 8.6(d), the statement
that $(uv)^{-1} = v^{-1}u^{-1}$, in his book Symmetry, published by Princeton University
Press.

With this rule, although perhaps not with its 

mathematical expression, you are all familiar. When 
you dress, it is not immaterial in which order you 
perform the operations; and when in dressing you 
start with the shirt and end up with the coat, then in 
undressing you observe the opposite order; first take 
off the coat and the shirt comes last. 

Here are some examples of units in familiar rings.

• In a field, every non-zero element is a unit.

• In $\mathbb{Z}$, the only units are $1$ and $-1$.

• Let $F$ be a field. Then a polynomial in the polynomial ring $F[x]$ is a unit if
and only if it is a non-zero constant polynomial. For we have

\[ \deg(f(x)g(x)) = \deg(f(x)) + \deg(g(x)), \]

so if $f(x)g(x) = 1$ then $f(x)$ must have degree zero, that is, it is a constant
polynomial.

• Let $F$ be a field and $n$ a positive integer. An element $A$ of the ring
$M_{n \times n}(F)$ is a unit if and only if the determinant of $A$ is non-zero. In
particular, $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is a unit in
$M_{2 \times 2}(F)$ if and only if $ad - bc \ne 0$; if this holds, then its inverse
is

\[ \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \]

• Which elements are units in the ring of integers mod $m$? The next result
gives the answer.

Proposition 8.7 Suppose that $m > 1$.

(a) An element $[a]_m$ of $\mathbb{Z}_m$ is a unit if and only if $\gcd(a,m) = 1$.

(b) If $\gcd(a,m) > 1$, then there exists $b \not\equiv_m 0$ such that
$[a]_m[b]_m = [0]_m$.

Proof Suppose that $\gcd(a,m) = 1$; we show that $a$ is a unit. By Euclid, there exist
integers $x$ and $y$ such that $ax + my = 1$. This means $ax \equiv_m 1$, so that
$[a]_m[x]_m = [1]_m$, and $[a]_m$ is a unit.

Now suppose that $\gcd(a,m) = d > 1$. Then $a/d$ and $m/d$ are integers, and we
have

\[ a \left( \frac{m}{d} \right) = \left( \frac{a}{d} \right) m \equiv_m 0, \]

so $[a]_m[b]_m = [0]_m$, where $b = m/d$. Since $0 < b < m$, we have $[b]_m \ne [0]_m$.
But this equation shows that $a$ cannot be a unit. For, if $[x]_m[a]_m = [1]_m$, then

\[ [b]_m = [1]_m[b]_m = [x]_m[a]_m[b]_m = [x]_m[0]_m = [0]_m, \]



a contradiction.

Example The table shows, for each non-zero element $[a]_{12}$ of $\mathbb{Z}_{12}$, an
element $[b]_{12}$ such that the product is either $0$ or $1$. To save space we write
$a$ instead of $[a]_{12}$.

     a    product        unit?
     1    1·1 = 1        ✓
     2    2·6 = 0        ×
     3    3·4 = 0        ×
     4    4·3 = 0        ×
     5    5·5 = 1        ✓
     6    6·2 = 0        ×
     7    7·7 = 1        ✓
     8    8·3 = 0        ×
     9    9·4 = 0        ×
    10    10·6 = 0       ×
    11    11·11 = 1      ✓

So the units in $\mathbb{Z}_{12}$ are $[1]_{12}$, $[5]_{12}$, $[7]_{12}$, and $[11]_{12}$.

Euler's function $\phi(m)$, sometimes called Euler's totient function, is defined to
be the number of integers $a$ satisfying $0 \le a \le m - 1$ and $\gcd(a,m) = 1$. Thus
$\phi(m)$ is the number of units in $\mathbb{Z}_m$.
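
As an illustration that is not part of the original notes, Proposition 8.7 and the definition of Euler's function can be checked by brute force. This is only a sketch in Python; the function names `units` and `phi` are our own choices.

```python
from math import gcd

def units(m):
    """Representatives a in 0..m-1 for which [a]_m is a unit of Z_m."""
    # Brute force: a is a unit exactly when some b satisfies a*b = 1 (mod m).
    return [a for a in range(m) if any((a * b) % m == 1 for b in range(m))]

def phi(m):
    """Euler's function: the number of a in 0..m-1 with gcd(a, m) = 1."""
    return sum(1 for a in range(m) if gcd(a, m) == 1)

print(units(12))   # [1, 5, 7, 11]
print(phi(12))     # 4

# Proposition 8.7(a): the units are exactly the residues coprime to m.
print(all(units(m) == [a for a in range(m) if gcd(a, m) == 1]
          for m in range(2, 50)))   # True
```

The brute-force search in `units` deliberately avoids using the gcd, so that the final comparison is a genuine check of the proposition for small moduli rather than a restatement of it.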

8.5 Appendix: The associative law 

In this section we give the proof that, if o is an operation on a set X which satisfies 
the associative law, then the composition of n terms doesn't depend on how we 
put in the brackets (Proposition 8.5). 

The proof is by induction on $n$. For $n = 2$, there are no brackets in $a_1 \circ a_2$,
and nothing to prove. For $n = 3$, there are two ways to put in the brackets, viz.
$a_1 \circ (a_2 \circ a_3)$ and $(a_1 \circ a_2) \circ a_3$; the associative law asserts
that they are equal. In the notes we saw that, for $n = 4$, there are five bracketings,
and the five expressions are all equal.

So now suppose that the statement is true for expressions with fewer than $n$
terms, and consider any two bracketings of $a_1 \circ \cdots \circ a_n$. Now for any
bracketing, when we work it out "from the inside out", in the last step we have just two
expressions to be composed; that is, the expression looks like

\[ (x_1 \circ \cdots \circ x_k) \circ (x_{k+1} \circ \cdots \circ x_n). \]

There may be further brackets inside the two terms, but (according to the inductive
hypothesis) they don't affect the result. We will say that the expression splits after
$k$ terms.

Suppose that the first expression splits after $k$ terms, and the second splits after
$l$ terms.

Case $k = l$ Both expressions now have the form

\[ (x_1 \circ \cdots \circ x_k) \circ (x_{k+1} \circ \cdots \circ x_n), \]

and by induction the bracketed terms don't depend on any further brackets. So
they are equal.

Case $k < l$ Now the first expression is

\[ (x_1 \circ \cdots \circ x_k) \circ (x_{k+1} \circ \cdots \circ x_n) \]

and the second is

\[ (x_1 \circ \cdots \circ x_l) \circ (x_{l+1} \circ \cdots \circ x_n). \]

By the induction hypothesis, the value of the term $x_1 \circ \cdots \circ x_l$ in the
second expression doesn't depend on where the brackets are; so we can rearrange the
brackets so that this term splits after $k$ terms, so that the whole expression is

\[ ((x_1 \circ \cdots \circ x_k) \circ (x_{k+1} \circ \cdots \circ x_l)) \circ (x_{l+1} \circ \cdots \circ x_n). \]

In the same way, we can rearrange the term $x_{k+1} \circ \cdots \circ x_n$ in the first
expression so that the whole expression is

\[ (x_1 \circ \cdots \circ x_k) \circ ((x_{k+1} \circ \cdots \circ x_l) \circ (x_{l+1} \circ \cdots \circ x_n)). \]

Now the two expressions are of the form $(a \circ b) \circ c$ and $a \circ (b \circ c)$, where

\[ a = x_1 \circ \cdots \circ x_k, \]
\[ b = x_{k+1} \circ \cdots \circ x_l, \]
\[ c = x_{l+1} \circ \cdots \circ x_n. \]

The associative law shows that they are equal.

Case $k > l$ This case is almost identical to the preceding one.
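
The induction above can be mirrored by a small experiment (our own sketch, not from the notes). The hypothetical helper `bracketings` enumerates every bracketing by choosing the point $k$ after which the outermost composition splits, exactly as in the proof, and collects the values obtained.

```python
def bracketings(xs, op):
    """Set of values obtained from every bracketing of xs[0] o ... o xs[-1]."""
    if len(xs) == 1:
        return {xs[0]}
    values = set()
    # The outermost composition splits the expression after k terms.
    for k in range(1, len(xs)):
        for left in bracketings(xs[:k], op):
            for right in bracketings(xs[k:], op):
                values.add(op(left, right))
    return values

# String concatenation is associative: one value, whatever the brackets.
print(bracketings(("a", "b", "c", "d", "e"), lambda s, t: s + t))   # {'abcde'}

# Subtraction is not associative: several different values appear.
print(len(bracketings((1, 2, 4, 8), lambda s, t: s - t)) > 1)       # True
```

An experiment like this proves nothing, of course; it only shows what the proposition predicts, and what can go wrong when the associative law fails.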

Exercises 

8.1 Let nZ be the set of all integers divisible by n. Show that nZ is a ring (with the 
usual addition and multiplication). Is it commutative? Does it have an identity? 

8.2 Let $\mathcal{P}(S)$ denote the set of all subsets of the set $S$. For
$A, B \in \mathcal{P}(S)$, define $A + B = A \triangle B$ (symmetric difference), and
$AB = A \cap B$ (intersection). Show that $\mathcal{P}(S)$ is a ring. Show also that
$A^2 = A$ for all $A \in \mathcal{P}(S)$.

8.3 Let $R$ be a ring in which $a^2 = a$ for all $a \in R$. By considering $(a + b)^2$, show

(a) $R$ is commutative;

(b) $a + a = 0$ for all $a \in R$.

(Such a ring is called a Boolean ring.)

Chapter 9
Groups



The additive and multiplicative axioms for rings are very similar. This similarity 
suggests considering a structure with a single operation, called a group. In this 
section we study groups and their properties. 

9.1 Definition 

A group is a set $G$ with an operation $\circ$ on $G$ satisfying the following axioms:

(G0) Closure law: for all $a, b \in G$, we have $a \circ b \in G$.

(G1) Associative law: for all $a, b, c \in G$, we have $a \circ (b \circ c) = (a \circ b) \circ c$.

(G2) Identity law: there is an element $e \in G$ (called the identity) such that
$a \circ e = e \circ a = a$ for any $a \in G$.

(G3) Inverse law: for all $a \in G$, there exists $b \in G$ such that $a \circ b = b \circ a = e$,
where $e$ is the identity. The element $b$ is called the inverse of $a$, written $a'$.

If in addition the following law holds:

(G4) Commutative law: for all $a, b \in G$ we have $a \circ b = b \circ a$

then G is called a commutative group, or more usually an abelian group (after the 
Norwegian mathematician Niels Abel). 
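
For a finite set the axioms can be verified mechanically. The following Python sketch is our own illustration; the names `is_group` and `is_abelian` are hypothetical, and the check is a brute-force one that is only practical for small sets.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of (G0)-(G3) for a finite set with a binary operation."""
    G = list(elements)
    # (G0) closure
    if any(op(a, b) not in G for a, b in product(G, repeat=2)):
        return False
    # (G1) associativity
    if any(op(a, op(b, c)) != op(op(a, b), c) for a, b, c in product(G, repeat=3)):
        return False
    # (G2) identity
    identities = [e for e in G if all(op(a, e) == a == op(e, a) for a in G)]
    if not identities:
        return False
    e = identities[0]
    # (G3) inverses
    return all(any(op(a, b) == e == op(b, a) for b in G) for a in G)

def is_abelian(elements, op):
    """(G4): the operation is commutative on the given elements."""
    G = list(elements)
    return all(op(a, b) == op(b, a) for a, b in product(G, repeat=2))

add6 = lambda a, b: (a + b) % 6
mul6 = lambda a, b: (a * b) % 6

print(is_group(range(6), add6))      # True:  integers mod 6 under addition
print(is_group(range(1, 6), mul6))   # False: 2 * 3 = 0 is not in the set
print(is_abelian(range(6), add6))    # True
```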

9.2 Elementary properties 

Many of the simple properties work in the same way as for rings.

Proposition 9.1 Let $G$ be a group.

(a) The composition of $n$ elements has the same value however the brackets are
inserted.

(b) The identity of $G$ is unique.

(c) Each element has a unique inverse.

(d) Cancellation law: if $a \circ b = a \circ c$ then $b = c$.

Proof (a) Proved in the appendix to the last section of the notes.

(b) If $e$ and $e^*$ are identities then

\[ e = e \circ e^* = e^*. \]

(c) If $b$ and $b^*$ are inverses of $a$ then

\[ b = b \circ e = b \circ a \circ b^* = e \circ b^* = b^*. \]

(d) If $a \circ b = a \circ c$, multiply on the left by the inverse of $a$ to get $b = c$.

9.3 Examples of groups 

We have some ready-made examples. 

• Let $R$ be a ring. Take $G = R$, with operation $+$; the identity is $0$ and the
inverse of $a$ is $-a$. This group is called the additive group of the ring $R$. It
is an abelian group.

• Let $R$ be a ring with identity, and let $U(R)$ denote the set of units of $R$, with
operation multiplication in $R$. This is a group:

- the closure, identity and inverse laws follow from the proposition on units in
the last part of the notes;

- the associative law follows from the ring axiom (M1).

This group is called the group of units of $R$. The next couple of examples
are special cases.

• In particular, if $F$ is a field, then the group $U(F)$ of units of $F$ consists of all
the non-zero elements of $F$. This is called the multiplicative group of $F$.

• Let $F$ be a field and $n$ a positive integer. The set $M_{n \times n}(F)$ of all
$n \times n$ matrices with elements in $F$ is a ring. We saw that a matrix is a unit in this
ring if and only if its determinant is non-zero. The group $U(M_{n \times n}(F))$ is
called the general linear group of dimension $n$ over $F$, written $\mathrm{GL}(n,F)$.

• Let $V$ be a vector space. Then, with the operation of vector addition, $V$ is an
abelian group; the identity is the zero vector $0$, and the inverse of $v$ is $-v$.

We will meet another very important class of groups in the next chapter. 

Remark on notation I have used here a neutral symbol o for the group opera- 
tion. In books, you will often see the group operation written as multiplication, or 
(in abelian groups) as addition. Here is a table comparing the different notations. 



    Notation          Operation          Identity    Inverse
    General           $a \circ b$        $e$         $a'$
    Multiplicative    $ab$, $a \cdot b$  $1$         $a^{-1}$
    Additive          $a + b$            $0$         $-a$

In order to specify the notation, instead of saying, "Let $G$ be a group", we often
say, "Let $(G, \circ)$ (or $(G, +)$, or $(G, \cdot)$) be a group". The rest of the notation
should then be fixed as in the table.

Sometimes, however, the notations get a bit mixed up. For example, even with
the general notation, it is common to use $a^{-1}$ instead of $a'$ for the inverse of $a$. I
will do so from now on.

9.4 Cayley tables 

If a group is finite, it can be represented by its operation table. In the case of 
groups, this table is more usually called the Cayley table, after Arthur Cayley who 
pioneered its use. Here, for example, is the Cayley table of the group of units of 
the ring Z12. 





      ·  |   1    5    7   11
    -----+--------------------
      1  |   1    5    7   11
      5  |   5    1   11    7
      7  |   7   11    1    5
     11  |  11    7    5    1



Notice that, like the solution to a Sudoku puzzle, the Cayley table of a group
contains each symbol exactly once in each row and once in each column (ignoring
row and column labels). Why? Suppose we are looking for the element $b$ in row $a$.
It occurs in column $x$ if $a \circ x = b$. This equation has the unique solution
$x = a^{-1} \circ b$, where $a^{-1}$ is the inverse of $a$. A similar argument applies to
the columns.
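
The Cayley table of the group of units of $\mathbb{Z}_{12}$ and its "Sudoku" property can be reproduced with a few lines of Python. This is a sketch of our own; the variable names are arbitrary.

```python
from math import gcd

m = 12
units = [a for a in range(m) if gcd(a, m) == 1]          # [1, 5, 7, 11]
table = [[(a * b) % m for b in units] for a in units]    # row a, column b

for a, row in zip(units, table):
    print(a, row)

# Every row and every column is a rearrangement of the list of units.
rows_ok = all(sorted(row) == units for row in table)
cols_ok = all(sorted(col) == units for col in zip(*table))
print(rows_ok, cols_ok)   # True True
```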
Example Let $G$ be a group with three elements $e, a, b$, with $e$ the identity. We
know part of the Cayley table:

     ∘ |  e   a   b
    ---+------------
     e |  e   a   b
     a |  a
     b |  b

Now consider $a \circ b$, the element in the second row and third column. This cannot
be $a$, since we already have $a$ in the row; and it cannot be $b$, since we already have
$b$ in the column. So $a \circ b = e$. With similar arguments we can find all the other
entries.

So there is only one "type" of group with three elements.

We will just stop and look at what this means. Let $(G, \circ)$ and $(H, *)$ be groups.
We say that $G$ and $H$ are isomorphic if there is a bijective (one-to-one and onto)
function $F : G \to H$ such that $F(g_1 \circ g_2) = F(g_1) * F(g_2)$ for all
$g_1, g_2 \in G$. In other words, we can match elements of $G$ with elements of $H$ such
that the group operation works in the same way on elements of $G$ and the matched
elements of $H$. The function $F$ is called an isomorphism.

Thus, the argument we just gave shows that any two groups with three elements
are isomorphic.

9.5 Subgroups 

Let $(G, \circ)$ be a group, and $H$ a subset of $G$, that is, a selection of some of the
elements of $G$. For example, let $G = (\mathbb{Z}, +)$ (the additive group of integers), and
$H = 4\mathbb{Z}$ (the set of multiples of 4).

We say that $H$ is a subgroup of $G$ if $H$, with the same operation (addition in our
example), is itself a group.

How do we decide if a subset H is a subgroup? It has to satisfy the group 
axioms. 

(G0) We require that, for all $h_1, h_2 \in H$, we have $h_1 \circ h_2 \in H$.

(G1) $H$ should satisfy the associative law; that is,
$(h_1 \circ h_2) \circ h_3 = h_1 \circ (h_2 \circ h_3)$ for all $h_1, h_2, h_3 \in H$. But
since this equation holds for any choice of three elements of $G$, it is certainly true
if the elements belong to $H$.

(G2) $H$ must contain an identity element. But, by the uniqueness of the identity,
this must be the same as the identity element of $G$. So this condition requires
that $H$ should contain the identity of $G$.

(G3) Each element of $H$ must have an inverse. Again by the uniqueness, this
must be the same as the inverse in $G$. So the condition is that, for any
$h \in H$, its inverse $h^{-1}$ belongs to $H$.

So we get one axiom for free and have three to check. But the amount of work 
can be reduced. The next result is called the Subgroup Test. 

Proposition 9.2 A non-empty subset $H$ of a group $(G, \circ)$ is a subgroup if and only
if, for all $h_1, h_2 \in H$, we have $h_1 \circ h_2^{-1} \in H$.

Proof If $H$ is a subgroup and $h_1, h_2 \in H$, then $h_2^{-1} \in H$, and so
$h_1 \circ h_2^{-1} \in H$.

Conversely suppose this condition holds. Since $H$ is non-empty, we can choose
some element $h \in H$. Taking $h_1 = h_2 = h$, we find that $e = h \circ h^{-1} \in H$; so (G2)
holds. Now, for any $h \in H$, we have $h^{-1} = e \circ h^{-1} \in H$; so (G3) holds. Then for
any $h_1, h_2 \in H$, we have $h_2^{-1} \in H$, so
$h_1 \circ h_2 = h_1 \circ (h_2^{-1})^{-1} \in H$; so (G0) holds.
As we saw, we get (G1) for free.

In our example, $G = \mathbb{Z}$, $H = 4\mathbb{Z}$, take two elements of $H$, say $4a$ and $4b$; then
since the group operation is $+$, the inverse of $4b$ is $-4b$, and we have to check
whether $4a - 4b \in H$. The answer is yes, since $4a - 4b = 4(a - b) \in 4\mathbb{Z}$. So
$4\mathbb{Z}$ is a subgroup.
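
For a finite group the Subgroup Test is a finite check. The sketch below is our own illustration: it uses the additive group of $\mathbb{Z}_{12}$ as a finite stand-in for the example $4\mathbb{Z} \subseteq \mathbb{Z}$, and the name `passes_subgroup_test` is hypothetical.

```python
def passes_subgroup_test(H, op, inverse):
    """Proposition 9.2: H is non-empty and h1 o h2^(-1) is in H for all h1, h2 in H."""
    H = set(H)
    return bool(H) and all(op(h1, inverse(h2)) in H for h1 in H for h2 in H)

n = 12
add = lambda a, b: (a + b) % n    # the group operation of (Z_12, +)
neg = lambda a: (-a) % n          # the inverse of a in (Z_12, +)

print(passes_subgroup_test({0, 4, 8}, add, neg))   # True:  the multiples of 4
print(passes_subgroup_test({0, 4, 7}, add, neg))   # False: 0 - 4 = 8 is missing
print(passes_subgroup_test(set(), add, neg))       # False: the empty set
```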

9.6 Cosets and Lagrange's Theorem 

In our example above, we saw that $4\mathbb{Z}$ is a subgroup of $\mathbb{Z}$. Now
$\mathbb{Z}$ can be partitioned into four congruence classes mod 4, one of which is the
subgroup $4\mathbb{Z}$. We now generalise this to any group and any subgroup.

Let $G$ be a group and $H$ a subgroup of $G$. Define a relation $\sim$ on $G$ by

\[ g_1 \sim g_2 \text{ if and only if } g_2 \circ g_1^{-1} \in H. \]

We claim that $\sim$ is an equivalence relation.

reflexive: $g_1 \circ g_1^{-1} = e \in H$, so $g_1 \sim g_1$.

symmetric: Let $g_1 \sim g_2$, so that $h = g_2 \circ g_1^{-1} \in H$. Then
$h^{-1} = g_1 \circ g_2^{-1} \in H$, so $g_2 \sim g_1$.

transitive: Suppose that $g_1 \sim g_2$ and $g_2 \sim g_3$. Then
$h = g_2 \circ g_1^{-1} \in H$ and $k = g_3 \circ g_2^{-1} \in H$. Then

\[ k \circ h = (g_3 \circ g_2^{-1}) \circ (g_2 \circ g_1^{-1}) = g_3 \circ g_1^{-1} \in H, \]

so $g_1 \sim g_3$.

Now since we have an equivalence relation on $G$, the set $G$ is partitioned into
equivalence classes for the relation. These equivalence classes are called cosets
of $H$ in $G$, and the number of equivalence classes is the index of $H$ in $G$, written
$|G : H|$.

What do cosets look like?

For any $g \in G$, let

\[ H \circ g = \{ h \circ g : h \in H \}. \]

We claim that any coset has this form. Take $g \in G$, and let $X$ be the equivalence
class of $\sim$ containing $g$. That is, $X = \{ x \in G : g \sim x \}$.

• Take $x \in X$. Then $g \sim x$, so $x \circ g^{-1} \in H$. Let $h = x \circ g^{-1}$. Then
$x = h \circ g \in H \circ g$.

• Take an element of $H \circ g$, say $h \circ g$. Then $(h \circ g) \circ g^{-1} = h \in H$, so
$g \sim h \circ g$; thus $h \circ g \in X$.

So every equivalence class is of the form $H \circ g$. We have shown:

Theorem 9.3 Let $H$ be a subgroup of $G$. Then the cosets of $H$ in $G$ are the sets of
the form

\[ H \circ g = \{ h \circ g : h \in H \} \]

and they form a partition of $G$.

Example Let $G = \mathbb{Z}$ and $H = 4\mathbb{Z}$. Since the group operation is $+$, the cosets of
$H$ are the sets $H + a$ for $a \in G$, that is, the congruence classes. There are four of
them, so $|G : H| = 4$.

Remark We write the coset as $H \circ g$, and call the element $g$ the coset representative.
But any element of the coset can be used as its representative. In the above
example,

\[ 4\mathbb{Z} + 1 = 4\mathbb{Z} + 5 = 4\mathbb{Z} - 7 = 4\mathbb{Z} + 100001 = \cdots \]

If G is finite, the order of G is the number of elements of G. (If G is infinite, 
we sometimes say that it has infinite order.) We write the order of G as |G|. 

Now the partition into cosets allows us to prove an important result, La- 
grange's Theorem: 

Theorem 9.4 Let $G$ be a finite group, and $H$ a subgroup of $G$. Then $|H|$ divides
$|G|$. The quotient $|G|/|H|$ is equal to $|G : H|$, the index of $H$ in $G$.

Proof We know that $G$ is partitioned into the cosets of $H$. If we can show that
each coset has the same number of elements as $H$ does, then it will follow that the
number of cosets is $|G|/|H|$, and the theorem will be proved.

So let $H \circ g$ be a coset of $H$. We define a function $f : H \to H \circ g$ by the rule
that $f(h) = h \circ g$. We show that $f$ is one-to-one and onto. Then the conclusion
that $|H \circ g| = |H|$ will follow.

$f$ is one-to-one: suppose that $f(h_1) = f(h_2)$, that is, $h_1 \circ g = h_2 \circ g$. By the
Cancellation Law, $h_1 = h_2$.

$f$ is onto: take an element $x \in H \circ g$, say $x = h \circ g$. Then $x = f(h)$, as required.
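
The partition into cosets, and the counting argument behind Lagrange's Theorem, can be seen in a small finite case. This Python sketch is our own; it again uses $(\mathbb{Z}_{12}, +)$ with the subgroup $\{0, 4, 8\}$, and the name `right_cosets` is hypothetical.

```python
def right_cosets(G, H, op):
    """The distinct cosets H o g (as frozensets) for g in G."""
    return {frozenset(op(h, g) for h in H) for g in G}

n = 12
G = range(n)
H = [0, 4, 8]                       # a subgroup of (Z_12, +)
add = lambda a, b: (a + b) % n

cosets = right_cosets(G, H, add)
print(sorted(sorted(c) for c in cosets))
# [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]]

# The cosets partition G, and each has as many elements as H ...
print(sorted(x for c in cosets for x in c) == list(G))   # True
print(all(len(c) == len(H) for c in cosets))             # True
# ... so |G| = |G : H| * |H|.
print(len(G) == len(cosets) * len(H))                    # True
```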



9.7 Orders of elements 

Remember that the order of a group is the number of elements in the group. We 
will define in this section the order of an element of a group. This is quite different 
- be careful not to get them confused - but there is a connection, as we will see. 

Let $g$ be an element of a group $G$. We define $g^n$ for every integer $n$ in the
following way:

\[ g^0 = e, \]
\[ g^n = g^{n-1} \circ g \text{ for } n > 0, \]
\[ g^{-n} = (g^n)^{-1} \text{ for } n > 0. \]

Now it is possible to prove that the exponent laws hold:

Proposition 9.5 For any integers $m$ and $n$,

(a) $g^m \circ g^n = g^{m+n}$,

(b) $(g^m)^n = g^{mn}$.

The proof is not difficult but needs a lot of care. It follows from the definition
that

\[ g^n = \begin{cases} g \circ \cdots \circ g \ (n \text{ factors}) & \text{if } n > 0, \\ g^{-1} \circ \cdots \circ g^{-1} \ (-n \text{ factors}) & \text{if } n < 0. \end{cases} \]

Now consider $g^m \circ g^n$. There are four cases.

• If $m$ and $n$ are both positive then

\[ g^m \circ g^n = g \circ \cdots \circ g \ (m + n \text{ factors}) = g^{m+n}. \]

• If one of $m$ and $n$ is positive and the other negative, say $m > 0$, $n < 0$, then

- If $m + n > 0$, so that $m > -n$, then $-n$ of the factors $g$ cancel all the
factors $g^{-1}$, leaving $m + n$ factors $g$, so the result is $g^{m+n}$.

- If $m + n < 0$, then $m$ of the factors $g^{-1}$ cancel all the factors $g$, leaving
$-m - n$ factors $g^{-1}$; again we have $g^{m+n}$.

• Finally, if $m$ and $n$ are both negative, a similar argument to the first case
applies.

If one of $m$ and $n$ is zero, say $m = 0$, then the product is $e \circ g^n = g^n$.
The argument for the second exponent law is similar.

It follows from the second exponent law that $(g^n)^{-1} = g^{-n}$. This also follows
because $g^n \circ g^{-n} = g^0 = e$.

Now we make two definitions. 

• The order of the element $g$ is the smallest positive number $n$ for which
$g^n = e$, if such a number exists; if no positive power of $g$ is equal to $e$, we
say that $g$ has infinite order.

• The subgroup generated by $g$ is the set

\[ \{ g^n : n \in \mathbb{Z} \} \]

of all powers of $g$. We write it as $\langle g \rangle$.

It is not clear from what has been said so far that "the subgroup generated by 
g" is actually a subgroup! In fact it is; this and more are contained in the next 
Proposition. Remember that the word "order" has two different meanings; the 
first is the number of elements in the subgroup, the second is the number we have 
just defined. 

Proposition 9.6 For any element $g$ of a group $G$, the set $\langle g \rangle$ is a subgroup
of $G$, and its order is equal to the order of $g$.

Proof To show that $\langle g \rangle$ is a subgroup, we apply the Subgroup Test. Take two
elements of this set, say $g^m$ and $g^n$. Then

\[ g^m \circ (g^n)^{-1} = g^m \circ g^{-n} = g^{m-n} \in \langle g \rangle. \]

Next we show that, if $g$ has order $n$, then

• $g^m = e$ if and only if $n$ divides $m$;

• $g^k = g^l$ if and only if $k \equiv_n l$.

Suppose that $m = nq$. Then $g^m = (g^n)^q = e^q = e$. Conversely, suppose that $g^m = e$.
By the Division Rule, $m = nq + r$, with $0 \le r \le n - 1$. Now $g^m = g^{nq} = e$, so $g^r = e$.
But $n$ is the smallest positive integer such that the $n$th power of $g$ is $e$; since $r < n$
we must have $r = 0$, and $n$ divides $m$.

Now $g^k = g^l$ if and only if $g^{l-k} = e$. By the preceding paragraph, this holds if
and only if $n$ divides $l - k$, that is, if and only if $k \equiv_n l$.

We see that if $g$ has order $n$, then the set $\langle g \rangle$ contains just $n$ elements (one for
each congruence class mod $n$), so it is a subgroup of order $n$.

Similarly, if $g$ has infinite order, then all the elements of $\langle g \rangle$ are distinct (since
if $g^k = g^l$ then $g^{l-k} = e$), so $\langle g \rangle$ is an infinite subgroup.

Corollary 9.7 Let $g$ be an element in a finite group of order $n$. Then $g^n = e$.

Proof The order of $g$ cannot be infinite, since $\langle g \rangle$ is a finite set in this case.
Suppose the order of $g$ is $m$. Then the order of the subgroup $\langle g \rangle$ is $m$. By
Lagrange's Theorem, $m$ divides $n = |G|$. As shown in the proof of Proposition 9.6, it
follows that $g^n = e$.

Now we can revisit Fermat's Little Theorem and prove a stronger version.

Proposition 9.8 Let $n$ be a positive integer, and $a$ an integer such that $\gcd(a, n) = 1$.
Then $a^{\phi(n)} \equiv_n 1$, where $\phi$ is Euler's totient function.

Proof Let $U_n$ be the group of units of $\mathbb{Z}_n$. Then $|U_n| = \phi(n)$, and
$[a]_n \in U_n$. By the preceding corollary, $[a^{\phi(n)}]_n = [a]_n^{\phi(n)} = [1]_n$; in other
words, $a^{\phi(n)} \equiv_n 1$.

Example There are four units in $\mathbb{Z}_{12}$, namely 1, 5, 7, 11. (We write $a$ instead of
$[a]_{12}$.) By the Corollary, if $a$ is one of these four numbers, then $a^4 \equiv_{12} 1$. In fact,
in this case $a^2 \equiv_{12} 1$ for each of the four numbers.
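
The orders of the units, and Proposition 9.8 itself, are easy to check numerically. The following Python sketch is our own illustration; `order_mod` and `phi` are hypothetical names, and the loops only test small moduli.

```python
from math import gcd

def order_mod(a, n):
    """Order of [a]_n in the group of units of Z_n (needs n >= 2, gcd(a, n) = 1)."""
    if n < 2 or gcd(a, n) != 1:
        raise ValueError("need n >= 2 and a coprime to n")
    k, power = 1, a % n
    while power != 1:
        power = (power * a) % n   # power is now a^(k+1) mod n
        k += 1
    return k

def phi(n):
    """Euler's totient function, computed directly from the definition."""
    return sum(1 for a in range(n) if gcd(a, n) == 1)

print([order_mod(a, 12) for a in (1, 5, 7, 11)])   # [1, 2, 2, 2]

pairs = [(a, n) for n in range(2, 60) for a in range(n) if gcd(a, n) == 1]
# Proposition 9.8: a^phi(n) is congruent to 1 mod n.
print(all(pow(a, phi(n), n) == 1 for a, n in pairs))          # True
# Lagrange: the order of each unit divides phi(n), the order of the group.
print(all(phi(n) % order_mod(a, n) == 0 for a, n in pairs))   # True
```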

9.8 Cyclic groups 

A group $G$ is a cyclic group if $G = \langle g \rangle$ for some element $g \in G$.

The prototypical cyclic group of order $n$ is $(\mathbb{Z}_n, +)$, while the prototypical
infinite cyclic group is $(\mathbb{Z}, +)$. In each case, the group is generated by the element
1.

Proposition 9.9 Any two cyclic groups of the same order are isomorphic.

Proof We show that a cyclic group of order $n$ is isomorphic to $\mathbb{Z}_n$, while an
infinite cyclic group is isomorphic to $\mathbb{Z}$.

Let $G = \langle g \rangle$ be a cyclic group of order $n$. We saw in the last section that the
element $g$ has order $n$, and that $g^k = g^l$ if and only if $k \equiv_n l$. Now the map
$[k]_n \mapsto g^k$ is well-defined and is one-to-one and onto, that is, a bijection, from
$\mathbb{Z}_n$ to $G$; and it is an isomorphism, since

\[ g^k \circ g^l = g^m \iff k + l \equiv_n m. \]

The proof for infinite groups is even simpler and is left to you.

Exercises 

9.1 Show that, if $b \circ a = c \circ a$, then $b = c$.

9.2 Let G be a group of order n. Show that G is a cyclic group if and only if G 
contains an element whose order is n. Hence show that any group of prime order 
is cyclic. 

9.3 Let $G$ be a group of order 4; say $G = \{e, a, b, c\}$, where $e$ is the identity.
Suppose that $G$ is not a cyclic group.

(a) Show that $a^2 = b^2 = c^2 = e$.

(b) Determine the Cayley table of G. 

(c) Show that G is abelian. 



Chapter 10
Permutations

We have seen rings and groups whose elements are numbers, polynomials, matrices,
and sets. In this chapter we meet another type of object: permutations. The
operation on permutations is composition, and we construct groups of permutations
which play an important role in general group theory.

10.1 Definition and representation 

A permutation of a set $X$ is a function $f : X \to X$ which is a bijection (one-to-one
and onto).

In this section we consider only the case when $X$ is a finite set, and we take
$X$ to be the set $\{1, 2, \ldots, n\}$ for convenience. As an example of a permutation, we
will take $n = 8$ and let $f$ be the function which maps $1 \mapsto 4$, $2 \mapsto 7$,
$3 \mapsto 3$, $4 \mapsto 8$, $5 \mapsto 1$, $6 \mapsto 5$, $7 \mapsto 2$, and $8 \mapsto 6$.

We can represent a permutation in two-line notation. We write a matrix with
two rows and $n$ columns. In the first row we put the numbers $1, \ldots, n$; under each
number $x$ we put its image under the permutation $f$. In our example, we have

\[ f = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 4 & 7 & 3 & 8 & 1 & 5 & 2 & 6 \end{pmatrix}. \]

How many permutations of the set $\{1, \ldots, n\}$ are there? We can ask this question
another way: how many matrices are there with two rows and $n$ columns,
such that the first row has the numbers $1, \ldots, n$ in order, and the second contains
these $n$ numbers in an arbitrary order? There are $n$ choices for the first element in
the second row; then $n - 1$ choices for the second element (since we can't re-use
the element in the first column); then $n - 2$ for the third; and so on until the last
place, where the one remaining number has to be put. So altogether the number
of permutations is

\[ n \cdot (n-1) \cdot (n-2) \cdots 1. \]

This number is called $n!$ (read "n factorial" or "factorial n"), the product of the
natural numbers from 1 to $n$. Thus we have proved:

Proposition 10.1 The number of permutations of the set $\{1, \ldots, n\}$ is $n!$.
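
The count can be confirmed for small $n$ by listing all possible second rows. This is a Python sketch of our own; `count_permutations` is a hypothetical helper built on the standard library function `itertools.permutations`.

```python
from itertools import permutations
from math import factorial

def count_permutations(n):
    """Count the possible second rows of a two-line form on {1, ..., n}."""
    return sum(1 for _ in permutations(range(1, n + 1)))

for n in range(1, 7):
    # The two counts printed on each line agree.
    print(n, count_permutations(n), factorial(n))
```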

10.2 The symmetric group 

Let $f_1$ and $f_2$ be permutations. We define the composition of $f_1$ and $f_2$ to be the
permutation obtained by applying $f_1$ and then $f_2$.

Warning If you write the image of $x$ under the permutation $f$ as $f(x)$, then the
composition of $f_1$ and $f_2$ maps $x$ to $f_2(f_1(x))$ - note the reversal! In order to make
the notation work better, we change the way we write the image of $x$ under $f$ by
putting $f$ on the right, as $xf$ (or sometimes up in the air, as $x^f$). Then we have
$x(f_1 \circ f_2) = (xf_1)f_2$, which is easier to remember.

You should be aware, though, that some people choose to resolve the problem
the other way, by defining the composition of $f_1$ and $f_2$ to be "first $f_2$, then $f_1$".

In practice, how do we compose permutations? (Practice is the right word
here: you should practise composing permutations until you can do it without
stopping to think.) Let $f$ be the permutation we used as an example in the last
section, and let

\[ g = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 3 & 2 & 1 & 8 & 7 & 6 & 5 & 4 \end{pmatrix}. \]

The easiest way to calculate $f \circ g$ is to take each of the numbers $1, \ldots, 8$, map it
by $f$, map the result by $g$, and write down the result to get the bottom row of the
two-line form for $f \circ g$. Thus, $f$ maps 1 to 4, and $g$ maps 4 to 8; so $f \circ g$ maps 1 to
8; $f$ maps 2 to 7, and $g$ maps 7 to 5, so $f \circ g$ maps 2 to 5; and so on.

Another way to do it is to re-write the two-line form for $g$ by shuffling the
columns around so that the first row agrees with the second row of $f$. Then the
second row will be the second row of $f \circ g$. Thus,

\[ g = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 3 & 2 & 1 & 8 & 7 & 6 & 5 & 4 \end{pmatrix} = \begin{pmatrix} 4 & 7 & 3 & 8 & 1 & 5 & 2 & 6 \\ 8 & 5 & 1 & 4 & 3 & 7 & 2 & 6 \end{pmatrix}, \]

so

\[ f \circ g = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 8 & 5 & 1 & 4 & 3 & 7 & 2 & 6 \end{pmatrix}. \]
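
The first method translates directly into code. In this Python sketch (ours, not from the notes) a permutation is stored as a dictionary sending $x$ to $xf$; the names `compose` and `from_bottom_row` are hypothetical, and `compose(f, g)` follows the convention of these notes: first $f$, then $g$.

```python
def compose(f, g):
    """f o g in the convention of the notes: apply f first, then g."""
    return {x: g[f[x]] for x in f}

def from_bottom_row(row):
    """Permutation of {1, ..., n} given the bottom row of its two-line form."""
    return {i: y for i, y in enumerate(row, start=1)}

f = from_bottom_row([4, 7, 3, 8, 1, 5, 2, 6])
g = from_bottom_row([3, 2, 1, 8, 7, 6, 5, 4])

fg = compose(f, g)
print([fg[x] for x in range(1, 9)])   # [8, 5, 1, 4, 3, 7, 2, 6]

gf = compose(g, f)
print([gf[x] for x in range(1, 9)])   # [3, 7, 4, 6, 2, 5, 1, 8]
```

The two bottom rows differ, so this pair of permutations does not commute.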
To see what is going on, remember that a permutation is a function, which can
be thought of as a black box. The black box for $f \circ g$ is a composite containing
the black boxes for $f$ and $g$ with the output of the first connected to the input of
the second:

[Figure: two black boxes in series, labelled $f$ and $g$, with the output of $f$ connected to the input of $g$.]

Now to calculate the result of applying $f \circ g$ to 1, we feed 1 into the input; the
first black box outputs 4, which is input to the second black box, which outputs 8.

We define a special permutation, the identity permutation, which leaves everything
where it is:

\[ e = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \end{pmatrix}. \]

Then we have $e \circ f = f \circ e = f$ for any permutation $f$.

Given a permutation $f$, we define the inverse permutation of $f$ to be the permutation
which "puts everything back where it came from" - thus, if $f$ maps $x$ to
$y$, then $f^{-1}$ maps $y$ to $x$. (This is just the inverse function as we defined it before.)
It can be calculated directly from this rule. Another method is to take the two-line
form for $f$, shuffle the columns so that the bottom row is $1\ 2\ \ldots\ n$, and then
interchange the top and bottom rows. For our example,

\[ f = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 4 & 7 & 3 & 8 & 1 & 5 & 2 & 6 \end{pmatrix} = \begin{pmatrix} 5 & 7 & 3 & 1 & 6 & 8 & 2 & 4 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \end{pmatrix}, \]

so

\[ f^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 5 & 7 & 3 & 1 & 6 & 8 & 2 & 4 \end{pmatrix}. \]

We then see that $f \circ f^{-1} = f^{-1} \circ f = e$.
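
The rule "if $f$ maps $x$ to $y$, then the inverse maps $y$ to $x$" is a one-line computation when a permutation is stored as a dictionary. The sketch below is our own; `inverse` and `compose` are hypothetical helper names, with `compose` meaning "first $f$, then $g$".

```python
def inverse(f):
    """If f maps x to y, the inverse maps y to x (swap the two rows)."""
    return {y: x for x, y in f.items()}

def compose(f, g):
    """f o g in the convention of the notes: apply f first, then g."""
    return {x: g[f[x]] for x in f}

f = dict(zip(range(1, 9), [4, 7, 3, 8, 1, 5, 2, 6]))
f_inv = inverse(f)
identity = {x: x for x in f}

print([f_inv[x] for x in range(1, 9)])   # [5, 7, 3, 1, 6, 8, 2, 4]
print(compose(f, f_inv) == identity and compose(f_inv, f) == identity)   # True
```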
Now you will not be surprised to learn: 

Theorem 10.2 The set of all permutations of $\{1, \ldots, n\}$, with the operation of
composition, is a group.

Proof The composition of two permutations is a permutation. The identity and
inverse laws have just been verified above. So all we have to worry about is the
associative law. We have

\[ x(f \circ (g \circ h)) = (xf)(g \circ h) = ((xf)g)h = (x(f \circ g))h = x((f \circ g) \circ h) \]

for all $x$; so $f \circ (g \circ h) = (f \circ g) \circ h$, the associative law.

(Essentially, this last argument shows that the result of applying $f \circ g \circ h$,
bracketed in any fashion, is "$f$, then $g$, then $h$".)

We call this group the symmetric group of degree $n$, and write it $S_n$. Note that
$S_n$ is a group of order $n!$.

Proposition 10.3 $S_n$ is an abelian group if $n \le 2$, and is non-abelian if $n \ge 3$.

Proof $S_1$ has order 1, and $S_2$ has order 2; it is easy to check that these groups are
abelian, for example by writing down their Cayley tables.

For $n \ge 3$, $S_n$ contains elements $f$ and $g$, where $f$ interchanges 1 and 2 and
fixes $3, \ldots, n$, and $g$ interchanges 2 and 3 and fixes $1, 4, \ldots, n$. Now check that
$f \circ g \ne g \circ f$. (For example, $f \circ g$ maps 1 to 3, but $g \circ f$ maps 1 to 2.)

10.3 Cycles 

We come now to a way of representing permutations which is more compact than 
the two-line notation described earlier, but (after a bit of practice!) just as easy to 
calculate with: this is cycle notation. 

Let $a_1, a_2, \ldots, a_k$ be distinct numbers chosen from the set $\{1, 2, \ldots, n\}$. The
cycle $(a_1, a_2, \ldots, a_k)$ denotes the permutation which maps $a_1 \mapsto a_2$,
$a_2 \mapsto a_3$, ..., $a_{k-1} \mapsto a_k$, and $a_k \mapsto a_1$. If you imagine
$a_1, a_2, \ldots, a_k$ written around a circle, then
the cycle is the permutation where each element moves to the next place round the
circle. Any number not in the set $\{a_1, \ldots, a_k\}$ is fixed by this manoeuvre.

Notice that the same permutation can be written in many different ways as a
cycle, since we may start at any point:

\[ (a_1, a_2, \ldots, a_k) = (a_2, \ldots, a_k, a_1) = \cdots = (a_k, a_1, \ldots, a_{k-1}). \]

If $(a_1, \ldots, a_k)$ and $(b_1, \ldots, b_l)$ are cycles with the property that no element
lies in both of the sets $\{a_1, \ldots, a_k\}$ and $\{b_1, \ldots, b_l\}$, then we say that the cycles
are disjoint, and define their product to be the permutation which acts as the first
cycle on the $a$s, as the second cycle on the $b$s, and fixes the other elements (if
any) of $\{1, \ldots, n\}$. In a similar way, we define the product of any set of pairwise
disjoint cycles.

Theorem 10.4 Any permutation can be written as a product of disjoint cycles.
The representation is unique, up to the facts that the cycles can be written in any
order, and each cycle can be started at any point.

Proof Our proof is an algorithm to find the cycle decomposition of a permutation.
We will consider first our standard example:

\[ f = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 4 & 7 & 3 & 8 & 1 & 5 & 2 & 6 \end{pmatrix}. \]

Now we do the following. Start with the first element, 1. Follow its successive
images under $f$ until it returns to its starting point:

\[ f : 1 \mapsto 4 \mapsto 8 \mapsto 6 \mapsto 5 \mapsto 1. \]

This gives us a cycle $(1,4,8,6,5)$.

If this cycle contains all the elements of the set $\{1, \ldots, n\}$, then stop. Otherwise,
choose the smallest unused element (in this case 2), and repeat the procedure:

\[ f : 2 \mapsto 7 \mapsto 2, \]

so we have a cycle $(2,7)$ disjoint from the first.

We are still not finished, since we have not seen the element 3 yet. Now
$f : 3 \mapsto 3$, so $(3)$ is a cycle with a single element. Now we have the cycle
decomposition:

\[ f = (1,4,8,6,5)(2,7)(3). \]

The general procedure is the same. Start with the smallest element of the set,
namely 1, and follow its successive images under $f$ until we return to something
we have seen before. This can only be 1. For suppose that
$f : 1 = a_1 \mapsto a_2 \mapsto \cdots \mapsto a_k \mapsto a_s$, where $1 < s \le k$. Then we
have $a_{s-1}f = a_s = a_kf$, contradicting the fact
that $f$ is one-to-one. So the cycle ends by returning to its starting point.

Now continue this procedure until all elements have been used up. We cannot
ever stray into a previous cycle during this procedure. For suppose we start at an
element $b_1$, and have $f : b_1 \mapsto \cdots \mapsto b_j \mapsto a_s$, where $a_s$ lies in an
earlier cycle. Then as before, $a_{s-1}f = a_s = b_jf$, contradicting the fact that $f$ is
one-to-one. So the cycles we produce really are disjoint.

The uniqueness is hopefully clear.

You should practise composing and inverting permutations in disjoint cycle
notation. Finding the inverse is particularly simple: all we have to do to find
$f^{-1}$ is to write each cycle of $f$ in reverse order!

We simplify the notation still further. Any element in a cycle of length 1 is 
fixed by the permutation, and by convention we do not bother writing such cycles. 
So our example permutation could be written simply as / = (1,4,8,6,5)(2,7). 
The fact that 3 is not mentioned means that it is fixed. (You may notice that there 
is a problem with this convention: the identity permutation fixes everything, and 
so would be written just as a blank space! We get around this either by writing 
one cycle $(1)$ to represent it, or by just calling it $e$.)
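
The algorithm in the proof of Theorem 10.4 can be sketched in Python as follows. This is our own illustration: a permutation is a dictionary sending $x$ to $xf$, and `cycles` is a hypothetical function name.

```python
def cycles(f):
    """Disjoint cycle decomposition of a permutation given as a dict x -> xf."""
    seen, result = set(), []
    for start in sorted(f):          # always take the smallest unused element
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:         # follow images until we are back at the start
            seen.add(x)
            cycle.append(x)
            x = f[x]
        result.append(tuple(cycle))
    return result

f = dict(zip(range(1, 9), [4, 7, 3, 8, 1, 5, 2, 6]))
print(cycles(f))                              # [(1, 4, 8, 6, 5), (2, 7), (3,)]

# The convention of the text: do not write cycles of length 1.
print([c for c in cycles(f) if len(c) > 1])   # [(1, 4, 8, 6, 5), (2, 7)]
```

The inner loop relies on the fact proved above: since $f$ is one-to-one, the first repeated element is always the starting point, so testing membership in `seen` is enough.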

Cycle notation makes it easy to get some information about a permutation:

Proposition 10.5 The order of a permutation is the least common multiple of the
lengths of the cycles in its disjoint cycle representation.

Proof Recall that the order of $f$ is the smallest positive integer $n$ such that $f^n = e$.
To see what is going on, return to our standard example:

\[ f = (1,4,8,6,5)(2,7)(3). \]

Now elements in the first cycle return to their starting position after 5 steps, and
again after 10, 15, ... steps. So, if $f^n = e$, then $n$ must be a multiple of 5. But
also the elements 2 and 7 swap places if $f$ is applied an odd number of times, and
return to their original positions after an even number of steps. So if $f^n = e$, then
$n$ must also be even. Hence if $f^n = e$ then $n$ is a multiple of 10. The point 3 is
fixed by any number of applications of $f$ so doesn't affect things further. Thus,
the order of $f$ is a multiple of 10. But $f^{10} = e$, since applying $f$ ten times takes
each element back to its starting position; so the order is exactly 10.

In general, if the cycle lengths are $k_1, k_2, \ldots, k_r$, then elements of the $i$th cycle
are fixed by $f^n$ if and only if $n$ is a multiple of $k_i$; so $f^n = e$ if and only if $n$ is a
multiple of all of $k_1, \ldots, k_r$, that is, a multiple of $\mathrm{lcm}(k_1, \ldots, k_r)$. So this lcm
is the order of $f$.
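
Proposition 10.5 can be compared with a direct computation of the order by repeated composition. The Python sketch below is ours; the three function names are hypothetical, and permutations are again dictionaries sending $x$ to $xf$.

```python
from functools import reduce
from math import gcd

def cycle_lengths(f):
    """Lengths of the cycles of f (fixed points count as cycles of length 1)."""
    seen, lengths = set(), []
    for start in f:
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            length += 1
            x = f[x]
        lengths.append(length)
    return lengths

def order_by_lcm(f):
    """Least common multiple of the cycle lengths."""
    return reduce(lambda a, b: a * b // gcd(a, b), cycle_lengths(f), 1)

def order_by_powers(f):
    """Smallest n > 0 with f^n = e, found by composing f with itself."""
    identity = {x: x for x in f}
    power, n = dict(f), 1
    while power != identity:
        power = {x: f[power[x]] for x in f}   # power becomes f^(n+1)
        n += 1
    return n

f = dict(zip(range(1, 9), [4, 7, 3, 8, 1, 5, 2, 6]))
print(sorted(cycle_lengths(f)), order_by_lcm(f), order_by_powers(f))   # [1, 2, 5] 10 10
```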

10.4 Transpositions 

A transposition is a permutation which swaps two elements $i$ and $j$ and fixes all
the other elements of $\{1, \ldots, n\}$. In disjoint cycle form, a transposition looks like
$(i, j)$.

Theorem 10.6 Any permutation in $S_n$ can be written as a product of transpositions.
The number of transpositions occurring in a product equal to a given
element $f$ is not always the same, but always has the same parity (even or odd)
depending on $f$.

Proof We begin by observing that 

(1, 2,. ..,«) = (!, 2)(1, 3). ••(!,«). 
For, in the product on the right, 

• 1 is mapped to 2 by the first factor, and remains there afterwards; 

• 2 is mapped to 1 by the first factor, then to 3 by the second, then stays there;

• $n-1$ is fixed by all factors until the second-last; it is mapped to 1 by the
second-last factor and then to $n$ by the last;

• n is fixed by all factors except the last, which takes it to 1. 

So the two permutations are equal. 

Now in exactly the same way, an arbitrary cycle $(a_1, a_2, \ldots, a_k)$ can be written
as a product of transpositions:

$$(a_1, a_2, \ldots, a_k) = (a_1, a_2)(a_1, a_3)\cdots(a_1, a_k).$$

Finally, given an arbitrary permutation, write it in disjoint cycle form, and then 
write each cycle as a product of transpositions. 

The statement about parity is harder to prove, and I have put the proof into an 
appendix. 

Our standard example can be written 

$$f = (1,4,8,6,5)(2,7) = (1,4)(1,8)(1,6)(1,5)(2,7).$$
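The decomposition of the standard example can be reproduced mechanically. This is an illustrative sketch only: the function names are invented here, and the product is read from left to right (the first factor is applied first), which is the convention used in the proof above.

```python
def cycle_to_transpositions(cycle):
    """Write (a1, a2, ..., ak) as (a1,a2)(a1,a3)...(a1,ak)."""
    a1 = cycle[0]
    return [(a1, a) for a in cycle[1:]]


def apply_left_to_right(transpositions, x):
    """Image of the point x under a product of transpositions,
    applying the leftmost factor first."""
    for a, b in transpositions:
        if x == a:
            x = b
        elif x == b:
            x = a
    return x


# Standard example f = (1,4,8,6,5)(2,7)(3); the fixed point 3 contributes
# no transpositions, so it is omitted from the list of cycles.
cycles_of_f = [(1, 4, 8, 6, 5), (2, 7)]
product = [t for c in cycles_of_f for t in cycle_to_transpositions(c)]
print(product)

# Compare with f given directly as a map from points to images.
f = {1: 4, 4: 8, 8: 6, 6: 5, 5: 1, 2: 7, 7: 2, 3: 3}
print(all(apply_left_to_right(product, x) == f[x] for x in f))
```

A cycle of length 1 yields an empty list of transpositions, which matches the fact that a fixed point needs no swaps.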

We call a permutation even or odd according as it is a product of an even or
odd number of transpositions; we call this the parity of $f$. Notice that a cycle of
length $k$ is a product of $k-1$ transpositions. So, if the lengths of the cycles of $f$
are $k_1, \ldots, k_r$ (including fixed points), then $f$ is the product of

$$(k_1 - 1) + (k_2 - 1) + \cdots + (k_r - 1) = n - r$$

transpositions (since the cycle lengths add up to $n$). In other words, if we define
$c(f)$ to be the number of cycles in the cycle decomposition of $f$, then the parity
of $f$ is the same as the parity of $n - c(f)$.
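As a small illustration of the formula $n - c(f)$, the following sketch computes the parity of the standard example. The dictionary representation and the names `count_cycles` and `parity` are assumptions made for this example, not notation from the notes.

```python
def count_cycles(perm):
    """Number of cycles c(f), fixed points included, of a dict {point: image}."""
    seen, count = set(), 0
    for start in perm:
        if start not in seen:
            count += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return count


def parity(perm):
    """0 for an even permutation, 1 for an odd one, computed as n - c(f) mod 2."""
    return (len(perm) - count_cycles(perm)) % 2


# Standard example f = (1,4,8,6,5)(2,7)(3): n = 8 and c(f) = 3, so n - c(f) = 5.
f = {1: 4, 4: 8, 8: 6, 6: 5, 5: 1, 2: 7, 7: 2, 3: 3}
print(count_cycles(f), parity(f))
```

The value $n - c(f) = 5$ agrees with the five transpositions in the decomposition $(1,4)(1,8)(1,6)(1,5)(2,7)$, so $f$ is odd.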

Theorem 10.7 Suppose that $n \ge 2$. Then the set of even permutations in $S_n$ is a
subgroup of $S_n$ having order $n!/2$ and index 2.

Proof Let $A_n$ be the set of even permutations in $S_n$. If $f_1, f_2 \in A_n$, then
$f_2^{-1}$ has the same cycle lengths as $f_2$ (since we just reverse all the cycles),
so it is also in $A_n$. Thus, $f_1$ and $f_2^{-1}$ are each products of an even number
of transpositions; and then so, obviously, is $f_1 \circ f_2^{-1}$. By the Subgroup
Test, $A_n$ is a subgroup.

Let $\sim$ be the equivalence relation defined by this subgroup; that is, $f_1 \sim f_2$
if and only if $f_1 \circ f_2^{-1} \in A_n$. By considering each of $f_1$ and $f_2$ as
products of transpositions, we see that $f_1 \sim f_2$ if and only if $f_1$ and $f_2$
have the same parity. So there are just two cosets of $A_n$.

By Lagrange's Theorem, 

$$|A_n| = |S_n|/2 = n!/2.$$

The subgroup $A_n$ consisting of even permutations is called the alternating
group of degree $n$.

Example For $n = 3$, we have $|S_3| = 3! = 6$, so $|A_3| = 3$. The three even
permutations are $e$, $(1,2,3)$ and $(1,3,2)$; the remaining three permutations, the
transpositions $(1,2)$, $(1,3)$ and $(2,3)$, form the other coset of $A_3$ in $S_3$.
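For small $n$ the statement of Theorem 10.7 can be confirmed by brute force. The sketch below is only a sanity check, not a proof. It assumes permutations of $\{0, \ldots, n-1\}$ stored as tuples of images (0-based for convenience, unlike the 1-based notation of the notes), and all function names are made up for this example.

```python
from itertools import permutations
from math import factorial


def is_even(images):
    """True if the permutation i -> images[i] is even, using n - c(f)."""
    n = len(images)
    seen = [False] * n
    cycle_count = 0
    for start in range(n):
        if not seen[start]:
            cycle_count += 1
            x = start
            while not seen[x]:
                seen[x] = True
                x = images[x]
    return (n - cycle_count) % 2 == 0


def alternating_group(n):
    """All even permutations of {0, ..., n-1}."""
    return [p for p in permutations(range(n)) if is_even(p)]


def compose(p, q):
    """The product p o q in the left-to-right convention: apply p, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


for n in range(2, 6):
    print(n, len(alternating_group(n)), factorial(n) // 2)

# Closure under composition for n = 4.
a4 = set(alternating_group(4))
print(all(compose(p, q) in a4 for p in a4 for q in a4))
```

For $n = 3$ the list returned by `alternating_group(3)` contains the identity and the two 3-cycles, matching the example above.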

Remark The formula for a $3 \times 3$ determinant can be expressed as follows. For
each permutation $f \in S_3$, we do the following. Pick the elements in row $i$ and
column $if$ of the matrix (where $if$ denotes the image of $i$ under $f$), for
$i = 1, 2, 3$, and multiply them together. That is, choose one term from each row
and column in all possible ways. Now multiply the product by $+1$ if $f$ is an even
permutation, and by $-1$ if $f$ is an odd permutation. Finally, add up these terms
for all the permutations.
For example, if

$$A = \begin{pmatrix} a & b & c \\ l & m & n \\ p & q & r \end{pmatrix},$$

the terms are as follows:

Permutation   Product   Sign
e             amr       +
(1,2,3)       bnp       +
(1,3,2)       clq       +
(1,2)         blr       -
(1,3)         cmp       -
(2,3)         anq       -

So $\det(A) = amr + bnp + clq - blr - cmp - anq$.

Now exactly the same procedure defines the determinant of an $n \times n$ matrix,
for any positive integer $n$. The drawback is that the number of terms needed for an
$n \times n$ determinant is $n!$, a rapidly growing function; so the work required
becomes unreasonable very quickly. This is not a practical way to compute
determinants; but it is as good a definition as any!
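The definition translates directly into a short program. The following sketch is meant only to illustrate the sum over permutations; as the remark says, it is not a practical method, since the loop runs over all $n!$ permutations. Rows and columns are indexed from 0, and the names `sign` and `det_by_permutations` are chosen for this example.

```python
from itertools import permutations


def sign(images):
    """+1 for an even permutation, -1 for an odd one, via n - c(f)."""
    n = len(images)
    seen = [False] * n
    cycle_count = 0
    for start in range(n):
        if not seen[start]:
            cycle_count += 1
            x = start
            while not seen[x]:
                seen[x] = True
                x = images[x]
    return 1 if (n - cycle_count) % 2 == 0 else -1


def det_by_permutations(matrix):
    """Sum over all permutations f of sign(f) times the product of the
    entries in row i and column f(i)."""
    n = len(matrix)
    total = 0
    for f in permutations(range(n)):
        term = sign(f)
        for i in range(n):
            term *= matrix[i][f[i]]
        total += term
    return total


# With a=1, b=2, c=3, l=4, m=5, n=6, p=7, q=8, r=10 the six terms are
# amr + bnp + clq - blr - cmp - anq = 50 + 84 + 96 - 80 - 105 - 48.
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
print(det_by_permutations(A))
```

Working with integer entries keeps the arithmetic exact, so the result can be compared term by term with the table above.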

10.5 Even and odd permutations 

In this Appendix, we prove that the parity (even or odd) of a permutation does not 
depend on the way we write it as a product of transpositions. We will give two 
entirely different proofs. 
First proof

For this proof, we see what happens when we multiply a permutation by a
transposition. We find that the number of cycles changes by 1 (it may increase or
decrease). There are two cases, depending on whether the two points transposed lie
in different cycles or the same cycle of the permutation. So let $f$ be a permutation
and $t$ a transposition.

Case 1: Transposing two points in different cycles. We may suppose that $f$
contains two cycles $(a_1, \ldots, a_k)$ and $(b_1, \ldots, b_l)$, and that
$t = (a_1, b_1)$ (this is because we can start each of the cycles at any point).
Cycles of $f$ not containing points moved by $t$ will be unaffected. Now we find

$$f \circ t : a_1 \mapsto a_2 \mapsto \cdots \mapsto a_k \mapsto b_1 \mapsto b_2 \mapsto \cdots \mapsto b_l \mapsto a_1,$$

so the two cycles of $f$ are "stitched together" into a single cycle in $f \circ t$, and
the number of cycles decreases by 1.

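Case 2: Transposing two points in the same cycle. This time let
$(a_1, \ldots, a_m, \ldots, a_k)$ be a cycle of $f$, and assume that $t = (a_1, a_m)$,
where $1 < m \le k$. This time

$$f \circ t : a_1 \mapsto a_2 \mapsto \cdots \mapsto a_{m-1} \mapsto a_1, \qquad a_m \mapsto a_{m+1} \mapsto \cdots \mapsto a_k \mapsto a_m,$$

so the single cycle of $f$ is "cut apart" into two cycles, and the number of cycles
increases by 1.

Now any permutation $f$ can be written as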

$$f = t_1 \circ t_2 \circ \cdots \circ t_s,$$

where $t_1, \ldots, t_s$ are transpositions. Let $f_i$ be the product of the first $i$ of
the transpositions, and consider the quantity $n - c(f_i)$, where $c(f)$ denotes the
number of cycles of $f$ (including fixed points). We start with $f_0 = e$, having $n$
fixed points, so $n - c(f_0) = 0$. Now, at each step, we multiply by a transposition,
so we change $c(f_i)$ by one, and hence change $n - c(f_i)$ by one. So the final value
$n - c(f)$ is even or odd depending on whether the number $s$ of transpositions is
even or odd. But $n - c(f)$ is defined just by the cycle decomposition of $f$,
independent of how we express it as a product of transpositions. So in any such
expression, the parity of the number of transpositions will be the same.
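The key step of this proof, that multiplying by a transposition changes the number of cycles by exactly one, can be tested exhaustively for small $n$. The sketch below is a sanity check under stated assumptions rather than a replacement for the argument: permutations of $\{0, \ldots, n-1\}$ are stored as tuples of images, products are taken left to right as in the notes, and the function names are invented for illustration.

```python
from itertools import combinations, permutations


def count_cycles(images):
    """Number of cycles (fixed points included) of i -> images[i]."""
    n = len(images)
    seen = [False] * n
    count = 0
    for start in range(n):
        if not seen[start]:
            count += 1
            x = start
            while not seen[x]:
                seen[x] = True
                x = images[x]
    return count


def then_transpose(images, a, b):
    """The product f o t with t = (a, b): apply f first, then swap a and b."""
    swap = {a: b, b: a}
    return tuple(swap.get(y, y) for y in images)


def cycle_count_changes(n):
    """Set of all values c(f o t) - c(f) over f in S_n and transpositions t."""
    changes = set()
    for f in permutations(range(n)):
        for a, b in combinations(range(n), 2):
            changes.add(count_cycles(then_transpose(f, a, b)) - count_cycles(f))
    return changes


print(cycle_count_changes(4))

# Case 2 in miniature: the 3-cycle 0 -> 1 -> 2 -> 0 is cut into two cycles.
three_cycle = (1, 2, 0)
print(count_cycles(three_cycle), count_cycles(then_transpose(three_cycle, 0, 1)))
```

Only the changes $+1$ and $-1$ should ever appear, corresponding to Case 2 and Case 1 respectively.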
Second proof

Let $x_1, \ldots, x_n$ be $n$ indeterminates, and consider the function

$$F(x_1, \ldots, x_n) = \prod_{i<j} (x_j - x_i).$$

For example, for $n = 3$, we have

$$F(x_1, x_2, x_3) = (x_2 - x_1)(x_3 - x_1)(x_3 - x_2).$$

Given a permutation $f$, we define a new function $F^f$ of the same
indeterminates by applying the permutation $f$ to their indices:

$$F^f(x_1, \ldots, x_n) = \prod_{i<j} (x_{jf} - x_{if}).$$

For example, if $n = 3$ and $f = (2,3)$, then

$$F^{(2,3)}(x_1, x_2, x_3) = (x_3 - x_1)(x_2 - x_1)(x_2 - x_3) = -F(x_1, x_2, x_3).$$

The result of applying $f_1$ and then $f_2$ to $F$ is just the result of applying
$f_1 \circ f_2$ to $F$, as you may check. We show that, for any transposition $t$, we have

$$F^t(x_1, \ldots, x_n) = -F(x_1, \ldots, x_n).$$

It will follow that, if $f$ is expressed as the product of $s$ transpositions, then

$$F^f(x_1, \ldots, x_n) = (-1)^s F(x_1, \ldots, x_n).$$

Since the value of $F^f$ does not depend on which expression as a product of
transpositions we use, we see that $(-1)^s$ must be the same for all such expressions
for $f$, and hence the number of transpositions in the product must always have the
same parity, as required.

To prove our claim, take the transposition $t = (k, l)$, where $k < l$, and see what
it does to $F$. We look at the bracketed terms $(x_j - x_i)$ and see what happens to
them. There are several cases.

• If $\{k, l\} \cap \{i, j\} = \emptyset$, then the term is unaffected by the permutation $t$.

• If $i < k$, then the terms $(x_k - x_i)$ and $(x_l - x_i)$ are interchanged, and there is
no effect on $F$.

• If $k < i < l$, then the term $(x_i - x_k)$ goes to $(x_i - x_l) = -(x_l - x_i)$, and the
term $(x_l - x_i)$ goes to $(x_k - x_i) = -(x_i - x_k)$; the two sign changes cancel
out.

• If $i > l$, then the terms $(x_i - x_k)$ and $(x_i - x_l)$ are interchanged, and there is
no effect on $F$.

• Finally, the term $(x_l - x_k)$ is mapped to $(x_k - x_l) = -(x_l - x_k)$.

So the overall effect of $t$ is to introduce one minus sign, and we conclude that
$F^t = -F$, as required.
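The sign rule $F^f = \pm F$ can be spot-checked numerically. The sketch below evaluates $F$ at distinct integer values instead of working with indeterminates, so it is evidence for the identity rather than a proof of it. Indices are 0-based, permutations are tuples of images, and the names `F`, `F_permuted` and `sign` are chosen for this example.

```python
from itertools import permutations


def F(xs):
    """Product of (x_j - x_i) over all pairs i < j."""
    product = 1
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            product *= xs[j] - xs[i]
    return product


def F_permuted(xs, f):
    """F^f: the same product with each index i replaced by its image f[i]."""
    return F([xs[f[i]] for i in range(len(xs))])


def sign(f):
    """+1 for an even permutation, -1 for an odd one, via n - c(f)."""
    n = len(f)
    seen = [False] * n
    cycle_count = 0
    for start in range(n):
        if not seen[start]:
            cycle_count += 1
            x = start
            while not seen[x]:
                seen[x] = True
                x = f[x]
    return 1 if (n - cycle_count) % 2 == 0 else -1


# The example from the text: n = 3 and f = (2,3), which is (0, 2, 1) as a
# 0-based tuple of images.
xs3 = [1, 2, 4]
print(F(xs3), F_permuted(xs3, (0, 2, 1)))

# Every permutation in S_4 multiplies F by its sign.
xs4 = [1, 2, 4, 8]
print(all(F_permuted(xs4, f) == sign(f) * F(xs4) for f in permutations(range(4))))
```

Distinct sample values are needed so that $F$ is nonzero; otherwise both sides would vanish and the check would say nothing about the sign.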

Exercises 

10.1 Let $g = (1,5,4,9,6,3)(7,8)$ and $h = (1,4,3)(6,8,7)(5,9,2)$ be permutations
in the symmetric group $S_9$. Find $g \circ h$, $g^2$, $g^{-1}$, and
$g^{-1} \circ h \circ g$. Show that $h$ and $g^{-1} \circ h \circ g$ have the same order.

10.2 If g and h are elements of any group, show that 

$$(g^{-1} \circ h \circ g)^n = g^{-1} \circ h^n \circ g$$

for any integer $n$, and deduce that $h$ and $g^{-1} \circ h \circ g$ necessarily have
the same order.

10.3 List the elements of $S_4$, and say whether each is an even or odd permutation.